Summary:
This change removes further cases where the profiling mode
implementation relied on dynamic memory allocation. We're using
thread-local aligned (uninitialized) memory instead, which we initialize
appropriately with placement new.
Addresses llvm.org/PR38577.
Reviewers: eizan, kpw
Subscribers: jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D51278
llvm-svn: 340814
Summary:
This change saves and restores the full flags register in x86_64 mode.
This makes running instrumented signal handlers safer, and avoids flags
set during the execution of the event handlers from polluting the
instrumented call's flags state.
Reviewers: kpw, eizan, jfb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51277
llvm-svn: 340812
Summary:
We avoid using dynamic memory allocated with the internal allocator in
the profile collection service used by profiling mode. We use aligned
storage for globals and in-struct storage of objects we dynamically
initialize.
We also remove the dependency on `Vector<...>` which also internally
uses the dynamic allocator in sanitizer_common (InternalAlloc) in favour
of the XRay allocator and segmented array implementation.
This change addresses llvm.org/PR38577.
Reviewers: eizan
Reviewed By: eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50782
llvm-svn: 339978
Summary:
This reverses an earlier decision to allow seg-faulting from the
XRay-allocated memory if it turns out that the system cannot provide
physical memory backing that cannot be swapped in/out on Linux.
This addresses http://llvm.org/PR38588.
Reviewers: eizan
Reviewed By: eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50831
llvm-svn: 339869
Summary:
This change provides access to the file header even in the in-memory
buffer processing. This allows in-memory processing of the buffers to
also check the version, and the format, of the profile data.
Reviewers: eizan, kpw
Reviewed By: eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50037
llvm-svn: 338347
Summary:
This change moves FDR mode to use `internal_mmap(...)` from
sanitizer_common instead of the internal allocator interface. We're
doing this to sidestep the alignment issues we encounter with the
`InternalAlloc(...)` functions returning pointers that have some magic
bytes at the beginning.
XRay copies bytes into the buffer memory, and does not require the magic
bytes tracking the other sanitizers use when allocating/deallocating
buffers.
Reviewers: kpw, eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49972
llvm-svn: 338228
This change makes it so that the profiling mode implementation will only
write files when there are buffers to write. Before this change, we'd
always open a file even if there were no profiles collected when
flushing.
llvm-svn: 337443
When providing raw access to the FDR mode buffers, we used to not
include the extents metadata record. This oversight means that
processing the buffers in-memory will lose important information that
would have been written in files.
This change exposes the metadata record by serializing the data
similarly to how we would do it when flushing to files.
llvm-svn: 337441
MAP_NORESERVE is not supported or a no-op on BSD.
Reviewers: dberris
Reviewed By: dberris
Differential Revision: https://reviews.llvm.org/D49494
llvm-svn: 337440
Summary:
This is a follow-on to D49217 which simplifies and optimises the
implementation of the segmented array. In this patch we co-locate the
book-keeping for segments in the `__xray::Array<T>` with the data it's
managing. We take the chance in this patch to actually rename `Chunk` to
`Segment` to better align with the high-level description of the
segmented array.
With measurements using benchmarks landed in D48879, we've identified
that calls to `pthread_getspecific` started dominating the cycles, which
led us to revert the change made in D49217 to use C++ thread_local
initialisation instead (it reduces the cost by a huge margin, since we
save one PLT-based call to pthread functions in the hot path). In
particular, this is in `__xray::getThreadLocalData()`.
We also took the opportunity to remove the least-common-multiple based
calculation and instead pack as much data into segments of the array.
This greatly simplifies the API of the container which hides as much of
the implementation details as possible. For instance, we calculate the
number of elements we need for the each segment internally in the Array
instead of making it part of the type.
With the changes here, we're able to get a measurable improvement on the
performance of profiling mode on top of what D48879 already provides.
Depends on D48879.
Reviewers: kpw, eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49363
llvm-svn: 337343
Summary:
This change simplifies the XRay Allocator implementation to self-manage
an mmap'ed memory segment instead of using the internal allocator
implementation in sanitizer_common.
We've found through benchmarks and profiling these benchmarks in D48879
that using the internal allocator in sanitizer_common introduces a
bottleneck on allocating memory through a central spinlock. This change
allows thread-local allocators to eliminate contention on the
centralized allocator.
To get the most benefit from this approach, we also use a managed
allocator for the chunk elements used by the segmented array
implementation. This gives us the chance to amortize the cost of
allocating memory when creating these internal segmented array data
structures.
We also took the opportunity to remove the preallocation argument from
the allocator API, simplifying the usage of the allocator throughout the
profiling implementation.
In this change we also tweak some of the flag values to reduce the
amount of maximum memory we use/need for each thread, when requesting
memory through mmap.
Depends on D48956.
Reviewers: kpw, eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49217
llvm-svn: 337342
Summary:
Fix a bug in FDR mode which didn't allow for re-initialising the logging
in the same process. This change ensures that:
- When we flush the FDR mode logging, that the state of the logging
implementation is `XRAY_LOG_UNINITIALIZED`.
- Fix up the thread-local initialisation to use aligned storage and
`pthread_getspecific` as well as `pthread_setspecific` for the
thread-specific data.
- Actually use the pointer provided to the thread-exit cleanup handling,
instead of assuming that the thread has thread-local data associated
with it, and reaching at thread-exit time.
In this change we also have an explicit test for two consecutive
sessions for FDR mode tracing, and ensuring both sessions succeed.
Reviewers: kpw, eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49359
llvm-svn: 337341
We no longer pass CLANG_DEFAULT_CXX_STDLIB to the runtimes build
as it was causing issues so we can no longer use this variable. We
instead use cxx-headers as a dependency whenever this is available
since both XRay and libFuzzer are built as static libraries so this
is sufficient.
Differential Revision: https://reviews.llvm.org/D49346
llvm-svn: 337199
Summary:
Fix a TODO in CMake config for XRay tests to use the detected C++ ABI
library in the tests.
Also make the tests depend on the llvm-xray target when built in-tree.
Reviewers: kpw, eizan
Reviewed By: eizan
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D49358
llvm-svn: 337142
Summary:
llvm-xray changes:
- account-mode - process-id {...} shows after thread-id
- convert-mode - process {...} shows after thread
- parses FDR and basic mode pid entries
- Checks version number for FDR log parsing.
Basic logging changes:
- Update header version from 2 -> 3
FDR logging changes:
- Update header version from 2 -> 3
- in writeBufferPreamble, there is an additional PID Metadata record (after thread id record and tsc record)
Test cases changes:
- fdr-mode.cc, fdr-single-thread.cc, fdr-thread-order.cc modified to catch process id output in the log.
Reviewers: dberris
Reviewed By: dberris
Subscribers: hiraditya, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D49153
llvm-svn: 336974
Summary:
This change adds support for writing out profiles at program exit.
Depends on D48653.
Reviewers: kpw, eizan
Reviewed By: kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48956
llvm-svn: 336969
The list duplicates information already available in the parent
directory so use that instead. It is unclear to me why we need
to spell out the dependencies explicitly but fixing that should
be done in a separate patch.
Differential Revision: https://reviews.llvm.org/D49177
llvm-svn: 336905
Summary: XRayRecords now includes a PID field. Basic handlers fetch pid and tid each time they are called instead of caching the value. Added a testcase that calls fork and checks if the child TID is different from the parent TID to verify that the processes' TID are different in the trace.
Reviewers: dberris, Maknee
Reviewed By: dberris, Maknee
Subscribers: kpw, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D49025
llvm-svn: 336769
It turns out that the `${XRAY_HEADERS}` CMake variable was already
in use and was used for public headers. It seems that
`lib/xray/tests/CMakeLists.txt` was depending on this.
To fix rename the new `${XRAY_HEADERS}` to `${XRAY_IMPL_HEADERS}`.
llvm-svn: 336699
when building with an IDE so that header files show up in the UI.
This massively improves the development workflow in IDEs.
To implement this a new function `compiler_rt_process_sources(...)` has
been added that adds header files to the list of sources when the
generator is an IDE. For non-IDE generators (e.g. Ninja/Makefile) no
changes are made to the list of source files.
The function can be passed a list of headers via the
`ADDITIONAL_HEADERS` argument. For each runtime library a list of
explicit header files has been added and passed via
`ADDITIONAL_HEADERS`. For `tsan` and `sanitizer_common` a list of
headers was already present but it was stale and has been updated
to reflect the current state of the source tree.
The original version of this patch used file globbing (`*.{h,inc,def}`)
to find the headers but the approach was changed due to this being a
CMake anti-pattern (if the list of headers changes CMake won't
automatically re-generate if globbing is used).
The LLVM repo contains a similar function named `llvm_process_sources()`
but we don't use it here for several reasons:
* It depends on the `LLVM_ENABLE_OPTION` cache variable which is
not set in standalone compiler-rt builds.
* We would have to `include(LLVMProcessSources)` which I'd like to
avoid because it would include a bunch of stuff we don't need.
Differential Revision: https://reviews.llvm.org/D48422
llvm-svn: 336663
Changes:
- Remove static assertion on size of a structure, fails on systems where
pointers aren't 8 bytes.
- Use size_t instead of deducing type of arguments to
`nearest_boundary`.
Follow-up to D48653.
llvm-svn: 336648
Summary:
We found a bug while working on a benchmark for the profiling mode which
manifests as a segmentation fault in the profiling handler's
implementation. This change adds unit tests which replicate the
issues in isolation.
We've tracked this down as a bug in the implementation of the Freelist
in the `xray::Array` type. This happens when we trim the array by a
number of elements, where we've been incorrectly assigning pointers for
the links in the freelist of chunk nodes. We've taken the chance to add
more debug-only assertions to the code path and allow us to verify these
assumptions in debug builds.
In the process, we also took the opportunity to use iterators to
implement both `front()` and `back()` which exposes a bug in the
iterator decrement operation. In particular, when we decrement past a
chunk size boundary, we end up moving too far back and reaching the
`SentinelChunk` prematurely.
This change unblocks us to allow for contributing the non-crashing
version of the benchmarks in the test-suite as well.
Reviewers: kpw
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48653
llvm-svn: 336644
Summary:
Without this patch,
clang -fsanitize=address -xc =(printf 'int main(){}') -o a; ./a => deadlock in __asan_init>AsanInitInternal>AsanTSDInit>...>__getcontextx_size>_rtld_bind>rlock_acquire(rtld_bind_lock, &lockstate)
libexec/rtld-elf/rtld.c
wlock_acquire(rtld_bind_lock, &lockstate);
if (obj_main->crt_no_init)
preinit_main(); // unresolved PLT functions cannot be called here
lib/libthr/thread/thr_rtld.c
uc_len = __getcontextx_size(); // unresolved PLT function in libthr.so.3
check-xray tests currently rely on .preinit_array so we special case in
xray_init.cc
Subscribers: srhines, kubamracek, krytarowski, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D48806
llvm-svn: 336067
When XRay is being built as part of the just built compiler together
with libc++ as part of the runtimes build, we need an explicit
dependency from XRay to libc++ to make sure that the library is
available by the time we start building XRay.
Differential Revision: https://reviews.llvm.org/D48113
llvm-svn: 334575
Summary:
This is part of the larger XRay Profiling Mode effort.
This patch implements the profile writing mechanism, to allow profiles
collected through the profiler mode to be persisted to files.
Follow-on patches would allow us to load these profiles and start
converting/analysing them through the `llvm-xray` tool.
Depends on D44620.
Reviewers: echristo, kpw, pelikan
Reviewed By: kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45998
llvm-svn: 334472
Summary:
This is part of the larger XRay Profiling Mode effort.
This patch implements the wiring required to enable us to actually
select the `xray-profiling` mode, and install the handlers to start
measuring the time and frequency of the function calls in call stacks.
The current way to get the profile information is by working with the
XRay API to `__xray_process_buffers(...)`.
In subsequent changes we'll implement profile saving to files, similar
to how the FDR and basic modes operate, as well as means for converting
this format into those that can be loaded/visualised as flame graphs. We
will also be extending the accounting tool in LLVM to support
stack-based function call accounting.
We also continue with the implementation to support building small
histograms of latencies for the `FunctionCallTrie::Node` type, to allow
us to actually approximate the distribution of latencies per function.
Depends on D45758 and D46998.
Reviewers: eizan, kpw, pelikan
Reviewed By: kpw
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D44620
llvm-svn: 334469
This change uses 'const' for the retryingWriteAll(...) API and removes
unnecessary 'static' local variables in getting the temporary filename.
llvm-svn: 334267
Summary:
This fixes http://llvm.org/PR32274.
This change adds a test to ensure that we're able to link XRay modes and
the runtime to binaries that don't need to depend on the C++ standard
library or a C++ ABI library. In particular, we ensure that this will work
with C programs compiled+linked with XRay.
To make the test pass, we need to change a few things in the XRay
runtime implementations to remove the reliance on C++ ABI features. In
particular, we change the thread-safe function-local-static
initialisation to use pthread_* instead of the C++ features that ensure
non-trivial thread-local/function-local-static initialisation.
Depends on D47696.
Reviewers: dblaikie, jfb, kpw, eizan
Reviewed By: kpw
Subscribers: echristo, eizan, kpw, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D46998
llvm-svn: 334262
Summary:
This change extracts the recursion guard implementation from FDR Mode
and updates it to do the following:
- Do the atomic operation correctly to be signal-handler safe.
- Make it usable in both FDR and Basic Modes.
Before this change, the recursion guard relied on an unsynchronised read
and write on a volatile thread-local. A signal handler could then run in
between the read and the write, and then be able to run instrumented
code as part of the signal handling. Using an atomic exchange instead
fixes that by doing a proper mutual exclusion even in the presence of
signal handling.
Reviewers: kpw, eizan, jfb
Reviewed By: eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47696
llvm-svn: 334064
We don't actually need to support multiple definitions of the functions
in FDR mode, but rather want to make sure that the implementation-detail
functions are marked as 'static' instead. This allows the inliner to do
its magic better for these functions too, since inline functions must
have a unique address across translation units.
llvm-svn: 334001
We planned to have FDR mode's internals unit-tested but it turns out
that we can just use end-to-end testing to verify the implementation.
We're going to move towards that approach more and more going forward,
so we're merging the implementation details of FDR mode into a single
.cc file.
We also avoid globbing in the XRay test helper macro, and instead list
down the files from the lib directory.
llvm-svn: 333986
Summary:
This is part of the work to address http://llvm.org/PR32274.
We remove the calls to array-placement-new and array-delete. This allows
us to rely on the internal memory management provided by
sanitizer_common/sanitizer_internal_allocator.h.
Reviewers: eizan, kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47695
llvm-svn: 333982
XRay doesn't use RTTI and doesn't need it. We disable it explicitly in
the CMake config, similar to how the other sanitizers already do it.
Part of the work to address http://llvm.org/PR32274.
llvm-svn: 333867
Summary:
This is part of the larger XRay Profiling Mode effort.
This patch implements a centralised collector for `FunctionCallTrie`
instances, associated per thread. It maintains a global set of trie
instances which can be retrieved through the XRay API for processing
in-memory buffers (when registered). Future changes will include the
wiring to implement the actual profiling mode implementation.
This central service provides the following functionality:
* Posting a `FunctionCallTrie` associated with a thread, to the central
list of tries.
* Serializing all the posted `FunctionCallTrie` instances into
in-memory buffers.
* Resetting the global state of the serialized buffers and tries.
Depends on D45757.
Reviewers: echristo, pelikan, kpw
Reviewed By: kpw
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D45758
llvm-svn: 333624
Summary:
This is part of the larger XRay Profiling Mode effort.
This patch implements a central data structure for capturing statistics
about XRay instrumented function call stacks. The `FunctionCallTrie`
type does the following things:
* It keeps track of a shadow function call stack of XRay instrumented
functions as they are entered (function enter event) and as they are
exited (function exit event).
* When a function is entered, the shadow stack contains information
about the entry TSC, and updates the trie (or prefix tree)
representing the current function call stack. If we haven't
encountered this function call before, this creates a unique node for
the function in this position on the stack. We update the list of
callees of the parent function as well to reflect this newly found
path.
* When a function is exited, we compute statistics (TSC deltas,
function call count frequency) for the associated function(s) up the
stack as we unwind to find the matching entry event.
This builds upon the XRay `Allocator` and `Array` types in Part 1 of
this series of patches.
Depends on D45756.
Reviewers: echristo, pelikan, kpw
Reviewed By: kpw
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D45757
llvm-svn: 332313