Commit Graph

134 Commits

Author SHA1 Message Date
Keno Fischer
5417227e83 Use patchable rdtsc sequence to avoid slowdowns under rr
We (Julia) ship both support for using tracy to trace julia applications,
as well as using `rr` (https://github.com/rr-debugger/rr) for record-replay debugging.
After our most recent rebuild of tracy, users have been reporting signfificant performance
slowdowns when `rr` recording a session that happens to also load the tracy library
(even if tracing is not enabled). Upon further examination, the recompile happened
to trigger a protective heuristic that disabled rr's patching of tracy's use of
`rdtsc` because an earlier part of the same function happened to look like a
conditional branch into the patch region. See https://github.com/rr-debugger/rr/pull/3580
for details. To avoid this issue occurring again in future rebuilds of tracy,
adjust tracy's `rdtsc` sequence to be `nopl; rdtsc`, which (as of of the
linked PR) is a sequence that is guaranteed to bypass this heuristic
and not incur the additional overhead when run under rr.

This functionality is kept behind a compile-time flag `TRACY_PATCHABLE_NOPSLEDS`
in order to avoid polluting the instruction cache unnecessarily.
2023-09-20 20:21:40 -04:00
Vincent Loppin
98de2ed1c4 Remove warning (unused parameters) 2023-09-07 10:06:42 +02:00
Mathias Lang
c6d9741136 Fix and test TRACY_DEMANGLE for TracyClient
The configuration wasn't tested and stopped compiling.
This fixes it and adds a test to ensure it doesn't break again.
2023-09-06 01:28:58 +02:00
Björn Blissing
e1b1fd72dc Wrap std::numeric_limits<T>::max() in parenthesis
The windows.h header file defines the macro max. If the max macro is include
it will lead to name collisions with the std::numeric_limits<T>::max() function.

One solution is to define NOMINMAX before the inclusion of windows.h.
However, that might be a demanding task for a large codebase. Defining the
NOMINMAX as global define may also break previous code.

Another to solution to the problem is to wrap the numeric_limits function in
parenthesis to instruct the compiler to treat it as a function and not a macro.

This commit wraps the std::numeric_limits<T>::max() calls in the public
interfacing header files, with parenthesis.
2023-08-17 10:03:52 +02:00
menduz
e05545b04a
Fix mac compile for latest macos SDK
Following 3feb2473a2, `mach_vm.h` seems to have been deleted
2023-08-15 21:00:46 -03:00
Ястребков Дмитрий Ирикович
de45af63cc Fix compilation for the case of using TRACY_NO_CALLSTACK 2023-07-31 11:40:12 +07:00
Bartosz Taudul
72cfa3e0d1
Fix copy pasta. 2023-05-04 15:16:46 +02:00
Cody Tapscott
6249999153 linux: respect TRACY_NO_SAMPLING for sys-tracing
This compile-time flag was being ignored on Linux. This change adds
gating for software-sampled stack trace sampling following the same
pattern as other `TRACY_NO_SAMPLE_*` options.

If `TRACY_NO_SAMPLING=1` is provided as an environment variable,
software stack sampling is also disabled.
2023-04-04 17:22:31 -04:00
Bartosz Taudul
e2e55a77b5
Add TracySetProgramName() macro to set broadcast contents. 2023-03-30 21:51:00 +02:00
Bartosz Taudul
2971db21e3
Read and report power usage. 2023-03-10 00:23:09 +01:00
Bartosz Taudul
c3e7157cd5
Detect power domains. 2023-03-10 00:05:18 +01:00
Bartosz Taudul
5e2e5eeefb
Add system power use tracking skeleton. 2023-03-09 22:31:31 +01:00
Bartosz Taudul
7151c6afd9
Add support for configuring plots to C API. 2023-03-08 23:18:36 +01:00
Lectem
ecdf6adc32 Fix race condition for symbols resolution on windows
There might have been new modules loaded by another thread between the `SymInitialize` and `EnumProcessModules` calls.
Since we register the enumerated modules into the cache, we need to make sure that symbols for this module are loaded.
The only way to do that is to call `SymLoadModuleEx`, just like we do when finding new modules after `InitCallstack`.
2023-02-14 15:32:37 +01:00
John Plate
37bc03fd63 Fix MSVC compiler warning 2023-02-08 13:27:17 +00:00
mwl4
1439e93a69 Fix compilation on linux: always initialize ScopedZone::m_connectionId to 0.
gcc error:
public/tracy/../client/TracyScoped.hpp:102:9: error: ‘___tracy_scoped_zone.tracy::ScopedZone::m_connectionId’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if( GetProfiler().ConnectionId() != m_connectionId ) return;
         ^~
2023-01-25 15:11:49 +01:00
mwl4
789f572332 Fix compilation on linux: use abort() instead of assert( false ).
assert() in release configuration resolves to empty code, while abort() is marked as [[noreturn]] and always is available.

gcc error:
error: ‘type’ may be used uninitialized in this function [-Werror=maybe-uninitialized]:
public/tracy/../client/../common/TracyAlign.hpp: In function ‘void tracy::SysTraceWorker(void*)’:
public/tracy/../client/../common/TracyAlign.hpp:22:11: error: ‘type’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
     memcpy( ptr, &val, sizeof( T ) );
     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from public/TracyClient.cpp:26,
                 from X.cpp:
public/client/TracySysTrace.cpp:1258:35: note: ‘type’ was declared here
                         QueueType type;
                                   ^~~~
2023-01-25 15:08:51 +01:00
Bartosz Taudul
d47122586c
Fix color channel names in source location message. 2023-01-23 01:23:15 +01:00
Bartosz Taudul
6652999a60
Fix color channels names in zone color message. 2023-01-23 01:18:54 +01:00
Bartosz Taudul
eb7c13e7bb
Fix message color component names in the protocol.
Red and blue channels were mislabeled. Otherwise, coding and decoding was
performed correctly, as far as the color channel order described in the manual
is followed by the user.

No change to the binary protocol was made.
2023-01-23 01:07:07 +01:00
Bartosz Taudul
ecb4a0527a
Do not retrieve call stacks if sampling is not requested. 2023-01-05 21:45:44 +01:00
Bartosz Taudul
c12505c19d
Fix TRACY_NO_CALLSTACK on Windows. 2022-12-22 21:17:31 +01:00
Bartosz Taudul
ba6416f68c
Merge pull request #497 from pshurgal/detailed_function_names
More detailed function names under Clang and GCC
2022-11-30 12:14:46 +01:00
Petr Shurgalin
5ddf62b54e Customizable source location data 2022-11-30 12:39:50 +02:00
ReplayCoding
311ad7b061
Fix compile error when TRACY_NO_CRASH_HANDLER is enabled 2022-11-27 13:40:45 -08:00
Bartosz Taudul
2c03306341
Always provide Callstack() implementation, even if dummy.
This fixes usage with TRACY_HAS_CALLSTACK undefined, allowing compilation of
otherwise unused functions, which are already protected from being called
through macro redirections.

See https://github.com/wolfpld/tracy/pull/492 for more information.
2022-11-09 21:55:41 +01:00
Bartosz Taudul
6c74320b3e
Merge pull request #488 from xxxbxxx/master
Added typed plots variants to the C API
2022-11-03 19:28:04 +01:00
Bartosz Taudul
e7ac54fba6
Signals are only set if TRACY_NO_CRASH_HANDLER is not defined. 2022-11-03 17:21:09 +01:00
xxxbxxx
a0cb8eb1d5 Added typed plots variants to the C API 2022-11-03 13:47:04 +01:00
Bartosz Taudul
970468f937
Override dlclose() to do nothing.
Provide a custom no-op implementation of dlclose(), in order to prevent shared
object data from disappearing from profiler view. The server makes queries for
program executable code, which has to be always available, otherwise wrong
data may be provided, or the program may crash, due to referencing no longer
mapped memory.

The dlclose() documentation states that the function internally decreases the
reference count, and only does unload the shared object when the count reaches
zero. There is no guarantee that the shared object data will be unloaded
immediately after any dlclose call originating from the program. This function
override exploits this fact.
2022-10-28 01:21:52 +02:00
Bartosz Taudul
898140fbda
Don't read payload.extra, if not needed. 2022-10-28 00:34:46 +02:00
Bartosz Taudul
b88ef29792
Make sure source file data is properly tracked. 2022-10-13 19:00:22 +02:00
Bartosz Taudul
a85c0e18d2
Decouple source code retrieval from the profiler thread.
This will prevent apparent freezes of the profiler when debuginfod queries are
made.
2022-10-13 00:30:17 +02:00
Bartosz Taudul
6ca1c98655
Handle symbol thread crashes.
Should the symbol thread crash, mark that it is gone. This will allow the
profiler to transmit crash call stack, including resolved symbol names and
locations (which will resolve on the main profiler thread).
2022-10-13 00:30:17 +02:00
Bartosz Taudul
9657bdec72
Initialize rpmalloc properly in symbol worker. 2022-10-12 23:51:50 +02:00
Bartosz Taudul
4416dff342
Increase extra data in SymbolQueueItem to 64 bit. 2022-10-12 22:23:06 +02:00
Bartosz Taudul
f64fb95a77
Fix preprocesor condition. 2022-10-12 22:05:19 +02:00
Bartosz Taudul
8ca4bc761d
s_symbolTid is only available if crash handler is there. 2022-10-12 19:56:46 +02:00
Bartosz Taudul
a235dca7ea
Cleanup. 2022-10-12 01:42:22 +02:00
Bartosz Taudul
383ecb6a12
Remove CodeLocation query and CodeInformation response. 2022-10-11 22:56:23 +02:00
Bartosz Taudul
f509ed1561
Include PID in broadcast message. 2022-10-09 21:54:54 +02:00
Bartosz Taudul
f476e6a0f7
Ditto on windows. 2022-10-08 14:09:58 +02:00
Bartosz Taudul
2c289dbb84
Do not freeze symbol thread. 2022-10-08 14:08:31 +02:00
Bartosz Taudul
4399656e83
__GNUC__ version checks are not valid on clang. 2022-10-08 14:04:54 +02:00
Bartosz Taudul
2595f983e6
Include gcc patchlevel in compiler version report. 2022-10-08 14:04:54 +02:00
Bartosz Taudul
6f9dfc8469
Use dladdr, not libbacktrace in fast callstack decode path.
DecodeCallstackPtrFast() may be called outside the symbol processing thread,
for example in the crash handler. Using the less-capable dladdr functionality
doesn't have a big impact here. Callstack decoding in this context is used to
remove the uninteresting top part of the callstack, so that the callstack ends
at the crashing function, and not in the crash handler. Even if this
functionality would be impacted by this change, the damage done is close to
none.

The other alternative is to use locking each time a libbacktrace is to be
used, which does not seem to be worthy to do, considering that the problem
only occurs in a very rare code path.

NB everything was working when it was first implemented, because back then the
callstack decoding was still performed on the main thread, and not on a
separate, dedicated one.
2022-10-08 13:22:56 +02:00
Bartosz Taudul
7552341ff0
Increase possible inline stack size to 64 elements. 2022-10-04 22:16:20 +02:00
Bartosz Taudul
aa017e6a76
Merge pull request #468 from sherief/exception-handler-fix
Windows exception handler allows other handlers to be called.
2022-09-15 11:33:39 +02:00
Bartosz Taudul
0fc1c0f927
Make symbol thread exit status more robust. 2022-09-13 21:07:03 +02:00
Sherief Farouk
e8b3d22d76 Windows exception handler allows other handlers to be called.
The profiled app might install handlers to track crashes, write minidumps,
etc. - this patch makes sure the app's exception handler is called when
a crash happens while profiling with Tracy.
2022-09-10 17:16:58 -07:00
Pilzschaf
a55fd64a5b Added gpu zone begin non-alloc and callstack variants to the C API 2022-09-09 21:23:07 +02:00
Pilzschaf
41a1ac203b Added gpu calibration to the C API 2022-09-09 18:40:17 +02:00
Bartosz Taudul
2cc5eff9a2
Normalize symbol paths on libbacktrace systems. 2022-09-02 01:23:29 +02:00
Bartosz Taudul
8cc43284bd
Add path normalization function. 2022-09-02 01:23:14 +02:00
Robert Adam
ece8779362 Fix cpuid symbol redefinition on older GCC versions
Since commit 940f32c1a8 building the Tracy
library on Linux using a GCC version < 11 would result in compile errors
due to symbol redefinitions of __get_cpuid_max, __get_cpuid and
__get_cpuid_count.

This is because prior to GCC 11 the cpuid.h header file did not have any
include guards and thus including this header more than once would
produce the abovementioned errors.

To work around this issue, including cpuid.h has been wrapped into a
custom header file that itself uses include guards and thus shields
cpuid.h from being included multiple times.

Fixes #452
2022-08-31 17:59:46 +02:00
Bartosz Taudul
72b7d0db5b
Add user data pointer to parameter callback. 2022-08-26 00:46:01 +02:00
Bartosz Taudul
197007ab47
Keep a list of buffers left to handle.
Previously a bitmap of buffers was repeatedly scanned to see which buffers
still contain data. This process was needlessly wasting cycles (seen as a
hotspot when profiled) and worse yet, the workload increased with the number
of CPU cores (=> buffers used) to handle.

The new implementation instead maintains a list of buffer indices that have to
be handled. This list does not contain empty buffers, so each loop iteration
performs some work, instead of just spinning in search for buffers to handle.
2022-08-18 13:59:56 +02:00
Bartosz Taudul
940f32c1a8
Add include for cpuid. 2022-08-18 13:40:37 +02:00
Bartosz Taudul
07a56f1148
Load globals to local variables. 2022-08-18 01:08:22 +02:00
Bartosz Taudul
a237f108c7
Use source contents callback. 2022-08-17 16:04:20 +02:00
Bartosz Taudul
ed7be2faaa
Add source contents callback setup. 2022-08-17 16:04:18 +02:00
Bartosz Taudul
3dc542a464
Log invalid debuginfod queries.
Filename paths must be absolute, not relative.
2022-08-16 22:05:14 +02:00
Bartosz Taudul
d32dc47845
Add debug logging for debuginfod queries. 2022-08-16 22:05:08 +02:00
Bartosz Taudul
72ad40698b
Move initialization of callstack structs to a thread.
Initializing structures for callstack processing (building memory map of the
process, gathering kernel symbols, etc) takes some time, which in some cases
may be significant.

Callstack queries are now handled on a separate thread. In such setup it no
longer makes sense to block main thread execution with this lengthy init
process.

All the heavy initialization phase has been now moved to this separate
processing thread. Some initial callstack queries may now not produce
responses as promptly as before, but this is only because the main thread is
able to start working earlier.

Some parts of the initialization process may be critical to do in the main
thread, for example because the function responsible for gathering callstacks
must be loaded first. This is done still on the main thread, in a new function
InitCallstackCritical().
2022-08-16 13:55:46 +02:00
Bartosz Taudul
77e3a480a4
Properly terminate CPU model string. 2022-08-13 19:37:34 +02:00
Bartosz Taudul
bb22542a90
Decouple rpmalloc usage from TRACY_ENABLE flag. 2022-08-08 19:40:16 +02:00
Bartosz Taudul
c6464f44da
Fix typo. 2022-07-30 22:02:25 +02:00
Bartosz Taudul
e0f813d9e9
Add support for Vsync capture on Linux. 2022-07-30 21:29:44 +02:00
Bartosz Taudul
91b002267e
Emit dedicated Vsync frame messages. 2022-07-30 19:53:40 +02:00
Bartosz Taudul
86bc2020cb
Fix call to rpmalloc_thread_finalize in manual lifetime use-case. 2022-07-30 14:33:39 +02:00
Bartosz Taudul
52643fcd2a
Compatibility fixes for rpmalloc. 2022-07-30 14:13:54 +02:00
Bartosz Taudul
45ba4f2390
Disable auto-cleanup in rpmalloc. 2022-07-30 13:35:44 +02:00
Bartosz Taudul
6be10b6122
Update rpmalloc to 1.4.4. 2022-07-30 13:29:57 +02:00
Bartosz Taudul
f4b0654fcd
Allow setting plot color in the configuration message. 2022-07-24 13:32:21 +02:00
Bartosz Taudul
3f51409389
Add step and fill parameters to plot configuration. 2022-07-24 13:05:01 +02:00
Bartosz Taudul
810f1573ac
Use separate messages for transfer of different plot value types. 2022-07-24 13:00:36 +02:00
Bartosz Taudul
a75846dd88
Do not try to demangle really long function names. 2022-07-23 12:37:00 +02:00
Bartosz Taudul
d282425287
Fix demangle buffer. 2022-07-23 12:34:35 +02:00
Bartosz Taudul
7dc95bf3a8
Fix demangling of functions with names >64KB. 2022-07-20 01:21:43 +02:00
Bartosz Taudul
f52ac31468
Do not duplicate code. 2022-07-20 01:06:12 +02:00
Bartosz Taudul
a04f830962
Force inline string copy functions. 2022-07-20 01:04:59 +02:00
Bartosz Taudul
5e0eb7e26b
Drop string copy length asserts.
These might now truncate data.
2022-07-20 01:03:06 +02:00
Bartosz Taudul
440d20b864
Don't transfer lock release thread id.
We already know which thread is holding the lock.
2022-07-18 01:56:09 +02:00
Bartosz Taudul
06c7984a16
Move all client headers and sources to public/ directory. 2022-07-17 15:47:38 +02:00