Commit Graph

1103 Commits

Author SHA1 Message Date
Bartosz Taudul
ed50447f7a
Use alias for s_ring[i]. 2021-06-12 13:13:53 +02:00
Bartosz Taudul
f4c95eb021
Load linux kernel symbols list. 2021-06-11 01:31:02 +02:00
Bartosz Taudul
ca2130d56c
Process all data available in ring buffers. 2021-06-10 02:07:29 +02:00
Bartosz Taudul
5b7cd06840
Don't init rpmalloc, if we know it has been done already. 2021-06-10 01:48:11 +02:00
Bartosz Taudul
9c2ea8a71f
Specify minimum length of subframe skip in one place. 2021-06-09 02:13:00 +02:00
Bartosz Taudul
e1c3babf43
Check whole ignore list on ARM. 2021-06-09 02:06:28 +02:00
Bartosz Taudul
c79bfa349f
Update ARM CPU parts list. 2021-06-04 19:41:59 +02:00
Bartosz Taudul
ecf0e3d587
Update apple devices list. 2021-06-04 19:16:51 +02:00
Bartosz Taudul
859aa3b2b0
Setup system tracing before launching profiler threads.
This makes sure that profiler threads are properly included in sample data on
Linux. This was previously working because sample capture was performed
system-wide. Now samples are only captured in client context, which includes
all spawned threads. Since this inclusion only works for threads which will be
spawned after the trace starts, no thread can be created before sampling setup
is done.
2021-06-04 14:07:46 +02:00
Bartosz Taudul
2765be92fb
Sample time for hardware samples. 2021-06-04 12:50:55 +02:00
Bartosz Taudul
1616411257
Use AVX2 to search in strings with 32-byte blocks. 2021-06-03 13:49:38 +02:00
Bartosz Taudul
348582d6e4
Search for string matches with 8-byte blocks. 2021-06-03 13:10:26 +02:00
Bartosz Taudul
f8bb24ed36
Search for a character in string in 8-byte blocks. 2021-06-03 12:47:00 +02:00
Bartosz Taudul
483d31c1f4
Ringbuffer tail is not updated by kernel. 2021-06-03 01:14:44 +02:00
Bartosz Taudul
eb38640114
Fix uninitialized pointer. 2021-06-03 00:53:49 +02:00
Bartosz Taudul
b79014f3ee
Optimize parsing numbers.
Don't read byte-by-byte, process data in 8-byte packets.
2021-06-03 00:27:01 +02:00
Bartosz Taudul
b7c5939bb8
Merge remote-tracking branch 'origin/master' into hw 2021-06-02 01:12:28 +02:00
Bartosz Taudul
f4d80a4f5f
Fix rpmalloc init for TRACY_MANUAL_LIFETIME path. 2021-05-31 22:55:30 +02:00
Bartosz Taudul
3da84d1579
Hide rpmalloc init behind thread local boolean. 2021-05-31 22:38:22 +02:00
Bartosz Taudul
b0fc0d5dcc
Check if rpmalloc has to be initialized before each operation.
The C++11 spec states in [basic.stc.thread] thread storage duration:

2. A variable with thread storage duration shall be initialized before its
   first odr-use (3.2) and, if constructed, shall be destroyed on thread exit.

Previously Tracy relied on the TLS data being initialized:
- During thread creation (MSVC).
- Or during first use in a thread, but the initialization was performed for
  the whole TLS block.

It seems that new compilers are more granular with how they perform the
initialization, hence rpmalloc init has to be checked before each allocation,
as it cannot be "folded" into, for example, initialization of the profiler
itself.
2021-05-31 02:31:42 +02:00
Bartosz Taudul
92fb197aac
Use weak compare, yield thread. 2021-05-31 02:22:13 +02:00
Bartosz Taudul
3feb2473a2
Fix rpmalloc on ios.
https://github.com/mjansson/rpmalloc/issues/146
2021-05-30 13:38:29 +02:00
Bartosz Taudul
4fc02f5680
Ignore intrinsic wrappers during code location resolution. 2021-05-24 00:02:44 +02:00
Bartosz Taudul
cfb6d0d2ae
Timestamp conversion might be temporarily unavailable. 2021-05-23 20:32:09 +02:00
Bartosz Taudul
233a0bb6d6
Set precise_ip to 0 for cache on Intel.
Fuck knows how this is supposed to work. perf_event_open() opens the
descriptor successfully, but it produces no samples, if precise_ip is not 0.
There are no such problems on ARM (where precise_ip is 3, but maybe it is not
supported at all on that architecture, again, fuck knows if), and on AMD
perf_event_open() does not succeed when precise_ip > 0.
2021-05-23 19:45:13 +02:00
Bartosz Taudul
b2d5fe8e1f
Reduce sampling frequency. 2021-05-23 19:03:11 +02:00
Bartosz Taudul
b1e4d16537
PIDs are no longer needed in samples. 2021-05-23 19:00:45 +02:00
Bartosz Taudul
bbd1c4505c
Restrict perf to return events only for the current PID. 2021-05-23 18:53:09 +02:00
Bartosz Taudul
4ad6f682c8
Exclude VM-related stuff. 2021-05-23 18:44:16 +02:00
Bartosz Taudul
fece23a32b
Set frequency, not period.
This enables sampling on ARM dev board.
2021-05-23 18:02:06 +02:00
Bartosz Taudul
7d3119cbac
Remove irrelevant flag. 2021-05-23 18:01:18 +02:00
Bartosz Taudul
34ca6d865e
Sample branches and cache more frequently. 2021-05-22 02:28:32 +02:00
Bartosz Taudul
c7026cbc1f
Reduce hw sample period. 2021-05-22 02:27:34 +02:00
Bartosz Taudul
089eda0be9
Precise_ip should be shared in each pair of counters. 2021-05-22 02:16:49 +02:00
Bartosz Taudul
fef507dfa2
Merge remote-tracking branch 'origin/master' into hw 2021-05-22 02:05:47 +02:00
Bartosz Taudul
68948712b4
Don't sleep if queues are empty, but there's queries to handle. 2021-05-22 01:12:42 +02:00
Bartosz Taudul
1e6aedf9e6
Limit client query response rate.
Original idea by xavier <xavierb@gmail.com>
2021-05-22 01:05:06 +02:00
Bartosz Taudul
d7541bbdba
Allow disabling inline resolution on windows.
Original commit a6b25497 by xavier <xavierb@gmail.com>:

add TRACY_CALLSTACK_IGNORE_INLINES to tradeoff speed vs precision in win32 DecodeCallstackPtr()

SymQueryInlineTrace() is too slow in some cases:
300000 queries backlog getting processed at ~70 per second is prohibitive.

(without inlines resolution, it's more like ~20000 queries per second)
2021-05-21 22:27:35 +02:00
Bartosz Taudul
8ec08465ee
Add debug messages to perf event setup. 2021-05-21 01:47:45 +02:00
Bartosz Taudul
afcebb6e6a
Add debug print macros. 2021-05-21 01:47:31 +02:00
Bartosz Taudul
4d668741eb
Probe for acceptable precise_ip value.
This is stupid, but it's exactly what perf does... Sigh.
2021-05-21 01:33:37 +02:00
Bartosz Taudul
bcb7b94272
Tid is not needed. 2021-05-20 02:39:22 +02:00
Bartosz Taudul
f0f3babacf
Set correct message types. 2021-05-20 02:27:36 +02:00
Bartosz Taudul
5f3d1c0faf
Sample cache and branch stats. 2021-05-20 02:15:23 +02:00
Bartosz Taudul
faf87809d7
Reduce hw sampling rate. 2021-05-20 01:48:52 +02:00
Bartosz Taudul
741de5c8fb
Allow disabling cycle/retirement sampling. 2021-05-19 23:38:32 +02:00
Bartosz Taudul
2e38e70049
Reduce hardware sampling perdiod. Don't sample time. 2021-05-19 23:21:21 +02:00
Bartosz Taudul
101cdd9b4b
Don't send thread id for hw samples. 2021-05-19 22:52:13 +02:00
Bartosz Taudul
7794443453
Collect CPU cycles and instruction retirement events. 2021-05-19 21:09:55 +02:00
Bartosz Taudul
16101571e0
Close perf_event file descriptor on exec. 2021-05-19 21:09:55 +02:00
Bartosz Taudul
9cd1b26bc7
Keep count of ring buffers separate from number of CPUs. 2021-05-19 21:09:55 +02:00
Bartosz Taudul
b7d52d2eab
Store RingBuffer identifier. 2021-05-19 21:09:52 +02:00
Bartosz Taudul
42a272edf5
Allow control of sampling frequency. 2021-05-11 18:31:20 +02:00
Bartosz Taudul
a6c6943a6c
Check if GetThreadDescription() is supported.
This functionality is available since Win 10 1607.
2021-05-04 16:13:42 +02:00
Bartosz Taudul
eb7d220eea
Added support for TRACY_NO_FRAME_IMAGE define. 2021-04-29 20:55:16 +02:00
Bartosz Taudul
56f0bdd571
ARM doesn't follow x64 canonical address requirements. 2021-04-29 18:24:37 +02:00
Bartosz Taudul
505656df5a
Trace frame count may be zero. 2021-04-29 18:24:37 +02:00
JW
915693ac39 Use tracy_malloc rather than 'new' in ProfilerThreadDataKey
This codepath, involving a workaround for GCC < 8.4, called 'new' and
'delete' directly, which could cause infinite recursion when
user-provided versions of those functions were themselves using Tracy
functionality.

Now, this codepath uses Tracy's internal allocator.

See issues #194, #196
2021-04-12 10:06:35 -07:00
Bartosz Taudul
40efbe8529
Use rpmalloc for initialization-related allocations. 2021-04-10 13:02:32 +02:00
Bartosz Taudul
2bb5d126fd
rpmalloc_thread_initialize is called in RPMallocInit. 2021-04-10 12:55:00 +02:00
joshuakr
fa942d18fe Fix spacing 2021-04-09 15:35:44 -07:00
joshuakr
e845c23493 Removed duplicate function 2021-04-09 15:35:07 -07:00
joshuakr
3fad55d7bc Missed one 2021-04-09 15:34:21 -07:00
joshuakr
eac23cead2 PR feedback 2021-04-09 15:33:01 -07:00
Eric van Beurden
fc142b4f9c fixed a build break on AARCH64. 2021-04-09 13:50:35 -07:00
Eric van Beurden
00ac6d1d8e worked around Windows broken getenv() call. 2021-04-09 13:50:02 -07:00
Joshua Kriegshauser
76a02205f3 Fix centos shutdown crash 2021-04-09 11:58:34 -07:00
Bartosz Taudul
c288a7903b
Make {Startup,Shutdown}Profiler() signatures consistent. 2021-03-08 02:39:51 +01:00
Bartosz Taudul
99c6b91c0c
Fix sending GPU context name in on-demand mode. 2021-02-27 19:59:32 +01:00
Bartosz Taudul
c12de1b326
Merge pull request #178 from sideeffects/master
Add IsActive accessor to ScopedZone.
2021-02-16 20:52:29 +01:00
John Lynch
29af8352ee Add IsActive accessor to ScopedZone. 2021-02-12 20:30:43 -06:00
Bartosz Taudul
5ea71ea20d
Apparently program_invocation_short_name may be not defined. 2021-02-11 18:12:59 +01:00
Bartosz Taudul
26a8ec3909 Reuse existing variable. 2021-02-10 18:56:07 +01:00
Bartosz Taudul
5e48eebf26 Fix type in comparison. 2021-02-07 21:08:24 +01:00
Bartosz Taudul
9cfc36f92e Preserve valid order of server query acknowledgements. 2021-02-07 20:53:20 +01:00
Bartosz Taudul
99e68715c7 Include SIGABRT in crash handling. 2021-02-07 19:04:48 +01:00
Bartosz Taudul
ad2062fb40 Last-resort source code transfer from client to server. 2021-02-04 00:45:00 +01:00
Bartosz Taudul
3376620919 Move server query acknowledgement to a separate function. 2021-02-04 00:03:58 +01:00
Bartosz Taudul
f97223e394 Rename ParamPingback to more generic AckServerQueryNoop. 2021-02-04 00:03:58 +01:00
Bartosz Taudul
253c3ae4c8 Android applications spawn through a common executable.
/proc/self/exe -> /system/bin/app_process64
2021-02-01 15:26:34 +01:00
Bartosz Taudul
f89fd4ab04 Executable path discovery on BSDs. 2021-01-31 20:27:32 +01:00
Bartosz Taudul
a795c21962 Get process executable path on macos. 2021-01-31 19:34:39 +01:00
Bartosz Taudul
7f5810dfd6 Add GPU context name transfer to the protocol. 2021-01-31 18:46:42 +01:00
Bartosz Taudul
c92974d920 Send executable mtime in welcome message. 2021-01-31 17:45:31 +01:00
Bartosz Taudul
0ce113a96c Check mtime of profiled executable. 2021-01-31 17:45:31 +01:00
Bartosz Taudul
2890f24c97 Implement getting process executable path. 2021-01-31 17:37:54 +01:00
Bartosz Taudul
a3bfbab6bd Fix timer setup for fallback timer. 2021-01-29 11:20:23 +01:00
Bartosz Taudul
33ca38b581 Add a define for fallback timer usage. 2021-01-28 18:49:17 +01:00
Bartosz Taudul
b58358f81f Cosmetics. 2021-01-28 18:49:12 +01:00
Bartosz Taudul
6b276a1a64 Init rpmalloc thread-local data when sending messages.
There was a possibility of having uninitialized TLS block there, if the first
thing done in a thread was sending a message.
2021-01-27 02:14:23 +01:00
Bartosz Taudul
90de2d2f73 Support queuing serial items with callstack. 2021-01-15 22:11:34 +01:00
Bartosz Taudul
d4c0d4fbb7 Rename CallstackMemory to CallstackSerial. 2021-01-15 20:49:39 +01:00
Bartosz Taudul
5a8d30ddc3 Add transient OpenGL zones. 2021-01-15 20:13:09 +01:00
Bartosz Taudul
3d37c686cf Mark rprealloc as a part of Tracy API. 2020-12-27 14:11:45 +01:00
Bartosz Taudul
a467ef4c2b Expose rpmalloc init/finalize functions. 2020-12-26 14:57:54 +01:00
Bartosz Taudul
1a1df0229d Add missing include. 2020-12-26 14:48:31 +01:00
Bartosz Taudul
dab68b2f21 Manually initialize GUID structs. 2020-12-21 16:13:59 +01:00
Bartosz Taudul
063ad1f1d3 Check return value of EnableTraceEx2(). 2020-12-21 15:41:01 +01:00
Bartosz Taudul
2049332211 Broadcast to localhost if listening only on localhost. 2020-12-16 15:27:00 +01:00
AWoloszyn
064d264445 Fix switch for memory free.
Because of the layout difference between messageFat and
messageColorFat, this was referencing the text member
3-bytes offset from where it should have been.
2020-12-07 22:07:12 -05:00
Ben Vanik
7dfdad2e02 Adding ZoneColor to set a dynamic color override to an existing zone. 2020-11-27 20:12:24 +01:00
bjacob
dfdf70aea3
Fix shutdown with TRACY_NO_EXIT=1 on Android. (#134) 2020-11-26 20:33:54 +01:00
Benoit Jacob
fc8ef12a78 fix condition in LookUpMapping, and some cosmetic fixes 2020-11-23 11:56:08 -05:00
Benoit Jacob
d787636804 remove some useless inline keywords 2020-11-22 09:49:46 -05:00
bjacob
d05641d70c
Ensure that mappings have read permission before decoding symbols and reading code. (#129) 2020-11-21 21:05:39 +01:00
Bartosz Taudul
9facbe848c
Merge pull request #128 from philix/int_fixes
Fix integer type warnings
2020-11-19 17:17:08 +01:00
Felipe Oliveira Carvalho
c9865c5f95 Fix integer type warnings
This is necessary to compile Tracy-instrumented code in
codebases built with -Werror.
2020-11-19 16:36:01 +01:00
Bartosz Taudul
119e357dbf Improve parsing of kernel tracing data. 2020-11-19 11:37:05 +01:00
bjacob
3fe4e7c3a7
Fix sampling on Android with default su command. (#123) 2020-11-17 21:11:48 +01:00
Bartosz Taudul
09203905d6 Support memory pools in the C API. 2020-11-15 15:23:22 +01:00
Bartosz Taudul
e920b5cf64 Allow disabling call stack sampling.
Only on Windows for now.
2020-11-05 23:59:52 +01:00
Bartosz Taudul
4caaa325c2 Allow disabling context switch tracing.
Currently only on Windows.
2020-11-05 23:56:19 +01:00
Bartosz Taudul
a34abe646c Allow disabling vsync capture. 2020-11-05 23:44:28 +01:00
Bartosz Taudul
8b4e03486d Remove trailing whitespace. 2020-10-29 23:06:28 +01:00
Bartosz Taudul
e2515c6a99 Remove pre-C++11 compat macros from concurrentqueue. 2020-10-29 23:05:24 +01:00
Bartosz Taudul
3d2ff4ffd1 Add support for user-provided dbghelp locks. 2020-10-28 20:04:37 +01:00
Bartosz Taudul
d75503047c Test whole call stack for non-canonical pointers. 2020-10-06 18:27:14 +02:00
Bartosz Taudul
fc1b03d67d Remove non-canonical pointer at the end of sampled stack. 2020-10-02 22:14:33 +02:00
Bartosz Taudul
bd31e3d2d6 Send callstacks before sending events they belong to. 2020-09-29 16:40:19 +02:00
Bartosz Taudul
8eb51aa01d Get LFQ item before capturing callstack.
This is to ensure that thread local structures have been properly
initialized (lock-free queue buffers are thread local), as capturing
callstack involves allocating memory from rpmalloc, which must be
initialized in each thread before allocation.
2020-09-29 15:10:55 +02:00
Bartosz Taudul
24f25751ce Prevent move and copy of ScopedZone. 2020-09-24 01:31:24 +02:00
Bartosz Taudul
4db092437c Add support for custom allocator tracking to client. 2020-09-24 01:31:23 +02:00
Bartosz Taudul
593ce74042 Notify servers that client is no longer listening for connections.
This happens in these two cases:
- The client is exiting.
- A connection attempt is performed.

This message type is indicated by negative time value.
2020-09-20 22:20:33 +02:00
Bartosz Taudul
5c826c2723 Send signed active time in broadcast message.
This allows special treatment of negative values.
2020-09-20 22:15:10 +02:00
Bartosz Taudul
1e34b22a82 There's no QPC on non-windows systems. 2020-09-20 19:47:00 +02:00
Bartosz Taudul
24c834bf8c Don't set sample_max_stack with kernel headers < 4.8. 2020-09-15 17:38:35 +02:00
Niclas Olmenius
607f988d1a Remove unnecessary defines in concurrentqueue
Remove defines related to exceptions in
`tracy_concurrentqueue.h` as they are not used anywhere
2020-09-15 11:41:39 +02:00
Bartosz Taudul
4dae36cb73 Don't use EVENT_FILTER_EVENT_ID, etc with SDK < Win 8.1. 2020-09-06 14:02:18 +02:00
Bartosz Taudul
9e51e3fa85 Remove unused variable. 2020-09-06 14:02:12 +02:00
Bartosz Taudul
f3eabc28e2 Don't handle crashes, if there's no connection. 2020-08-28 17:21:52 +02:00
Bartosz Taudul
960c7fb1b9 Don't alias struct names in client and server. 2020-08-20 17:38:29 +02:00
Bartosz Taudul
411ca81786 Don't operate on reference. 2020-08-18 21:36:09 +02:00
Bartosz Taudul
818d20d273 Don't use image name as a replacement for source file.
Image name is now reported separately.
2020-08-18 20:34:11 +02:00
Bartosz Taudul
9ba7381030 Small speedup for ReadNumber(). 2020-08-18 20:07:15 +02:00
Bartosz Taudul
4d4b6c7ac9 Use memchr() to find newline in memory block. 2020-08-18 19:51:02 +02:00
Bartosz Taudul
c53a1f1dac Extend ScopedZone to allow allocated srcloc construction. 2020-08-16 15:52:27 +02:00
Bartosz Taudul
5239b706c3 Allow disabling code transfer. 2020-08-16 01:31:54 +02:00
Bartosz Taudul
518ce1e946 No need to store two same pointers. 2020-08-15 13:40:36 +02:00
Bartosz Taudul
28aae73f74 RingBuffer has const size, so use template.
This eliminates division.
2020-08-15 02:43:18 +02:00
Bartosz Taudul
16ad6ee2ac Tune number of ETW kernel buffers. 2020-08-13 15:26:36 +02:00
Bartosz Taudul
9258e2ced0 Restore TSC usage on Linux. 2020-08-13 01:41:05 +02:00
Bartosz Taudul
c0c9832713 Implement TSC conversion and caps checking in ring buffer. 2020-08-13 01:40:18 +02:00
Bartosz Taudul
98fe63b5eb Increase sampling frequency to 10 kHz.
Works fine on bare metal.
2020-08-12 22:18:59 +02:00
Bartosz Taudul
f7574c5adc Reduce ring buffer size to workaround sigbus on android. 2020-08-12 18:46:19 +02:00
Bartosz Taudul
649994706b Use clock monotonic raw on Linux.
Because Linux kernel interfaces are fucking stupid.
2020-08-12 16:49:30 +02:00
Bartosz Taudul
d48b3187b1 Call stack sampling using perf events. 2020-08-12 16:49:30 +02:00
Bartosz Taudul
c16200ac02 Add ring buffer for perf events. 2020-08-12 14:06:19 +02:00
Bartosz Taudul
90ed18222a Use proper allocator. 2020-08-12 01:30:22 +02:00
Bartosz Taudul
b1b7be0a46 Adjust kernel tracing threads priorities. 2020-08-12 01:27:59 +02:00
Bartosz Taudul
1f4bfb68a0 Increase Linux sys trace per-cpu buffer size to 4 MB. 2020-08-12 01:19:10 +02:00