Bartosz Taudul
1f0c18882c
Don't collect sys time after application has exited.
2019-10-29 23:05:14 +01:00
Bartosz Taudul
0f2503d334
Send time deltas in GPU time events.
2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176
Send delta times for context switches.
2019-10-25 19:13:11 +02:00
Bartosz Taudul
25b3cdc1ee
Send thread wakeups when handling disconnect request.
2019-10-25 18:22:42 +02:00
Bartosz Taudul
04b132b6e2
Check if requested data size doesn't overflow buffer.
2019-10-24 21:22:22 +02:00
Bartosz Taudul
ba61a9ed84
Transfer time deltas, not absolute times.
...
This change significantly reduces network bandwidth requirements.
Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
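The delta transfer described in ba61a9ed84 can be illustrated with a minimal sketch. The `DeltaEncoder`, `DeltaDecoder` and `RoundTrip` names are hypothetical and not taken from the Tracy sources; the sketch only shows the general idea of sending the difference from the previous timestamp and rebuilding the absolute value on the receiving side.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical client side: remembers the previous timestamp and
// emits only the difference to it.
struct DeltaEncoder
{
    int64_t prev = 0;
    int64_t Encode( int64_t absolute )
    {
        const int64_t delta = absolute - prev;
        prev = absolute;
        return delta;
    }
};

// Hypothetical server side: accumulates deltas back into absolute times.
struct DeltaDecoder
{
    int64_t prev = 0;
    int64_t Decode( int64_t delta )
    {
        prev += delta;
        return prev;
    }
};

// Passes a series of timestamps through the encoder and the decoder.
std::vector<int64_t> RoundTrip( const std::vector<int64_t>& times )
{
    DeltaEncoder enc;
    DeltaDecoder dec;
    std::vector<int64_t> out;
    out.reserve( times.size() );
    for( int64_t t : times ) out.push_back( dec.Decode( enc.Encode( t ) ) );
    return out;
}
```

Only the first value carries the full magnitude; later values are small differences. Deltas may be negative, so the sketch uses a signed type.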
Bartosz Taudul
cf88265304
Full 64-bit register is set by rdtsc.
2019-10-21 01:13:55 +02:00
Bartosz Taudul
07b66cd4ab
Move fake source location out of loop.
2019-10-20 22:18:05 +02:00
Bartosz Taudul
909503403b
Simplify delay calibration.
2019-10-20 22:13:29 +02:00
Bartosz Taudul
c774534b47
Use rdtsc instead of rdtscp.
...
But rdtscp is serializing!
No, it's not. Quoting the Intel Instruction Set Reference:
"The RDTSCP instruction is not a serializing instruction, but it does
wait until all previous instructions have executed and all previous
loads are globally visible. But it does not wait for previous stores to
be globally visible, and subsequent instructions may begin execution
before the read operation is performed.",
"The RDTSC instruction is not a serializing instruction. It does not
necessarily wait until all previous instructions have been executed
before reading the counter. Similarly, subsequent instructions may begin
execution before the read operation is performed."
So, the difference is in waiting for prior instructions to finish
executing. Notice that even in the rdtscp case, execution of the
following instructions may commence before time measurement is finished
and data stores may be still pending.
But, you may say, Intel in its "How to Benchmark Code Execution Times"
document shows that using rdtscp is superior to rdtsc. Well, not
exactly. What they do show is that when a *single function* is
considered, there are ways to measure its execution time with little to
no error.
This is not what Tracy is doing.
In our case there is no way to determine absolute "this is before" and
"this is after" points of a zone, as we probably already are inside
another zone. Stopping the CPU execution, so that a deeply nested zone
may be measured with great precision, will skew the measurements of all
parent zones.
And this is not what we want to measure, anyway. We are not interested
in how a *single function* behaves, but how a *whole program* behaves.
The out-of-order CPU behavior may influence the measurements? Good! We
are interested in that. We want to see *how* the code is really
executed. How is *stopping* the CPU to make a timer read an appropriate
thing to do, when we want to see how a program is performing?
At least that's the theory.
And besides all that, the profiling overhead is now reduced.
2019-10-20 20:52:33 +02:00
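For reference, the two timer reads discussed in c774534b47 can be sketched with compiler intrinsics. This is an illustration, not the Tracy implementation: the `ReadTimerRelaxed` and `ReadTimerOrdered` names and the `SKETCH_HAS_TSC` macro are hypothetical, and the non-x86 fallback to `std::chrono::steady_clock` is an assumption made only so the sketch compiles everywhere.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  if defined(_MSC_VER)
#    include <intrin.h>
#  else
#    include <x86intrin.h>
#  endif
#  define SKETCH_HAS_TSC 1
#else
#  include <chrono>
#  define SKETCH_HAS_TSC 0
#endif

// Plain rdtsc: does not necessarily wait for earlier instructions to finish.
inline uint64_t ReadTimerRelaxed()
{
#if SKETCH_HAS_TSC
    return (uint64_t)__rdtsc();
#else
    // Assumed fallback for non-x86 targets, not part of the original commit.
    return (uint64_t)std::chrono::steady_clock::now().time_since_epoch().count();
#endif
}

// rdtscp: waits for earlier instructions to execute, but is still not a
// serializing instruction. The CPU must support it.
inline uint64_t ReadTimerOrdered()
{
#if SKETCH_HAS_TSC
    unsigned int aux;   // receives the IA32_TSC_AUX value, unused here
    return (uint64_t)__rdtscp( &aux );
#else
    return ReadTimerRelaxed();
#endif
}
```

Per the commit message, a whole-program profiler would use the relaxed variant at both ends of a zone, accepting out-of-order effects as part of what is being measured.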
Bartosz Taudul
30fc2f02ab
Omit calculation of on-stack variable address.
2019-10-20 19:42:29 +02:00
Bartosz Taudul
c3870f8837
Use proper type.
2019-10-10 20:30:08 +02:00
Bartosz Taudul
707f113bda
Add missing NOMINMAX definitions.
2019-10-10 20:29:06 +02:00
Bartosz Taudul
7cf3608493
Avoid unused variables.
2019-10-05 02:11:45 +02:00
Bartosz Taudul
e481b5ba22
Add missing thread sent indication.
2019-10-04 19:18:47 +02:00
Bartosz Taudul
9e1935f070
Make C API symbols visible across dlls.
2019-10-03 22:39:26 +02:00
Bartosz Taudul
130365f4ff
Inject tracy_systrace into filesystem and use instead of cat.
...
Statistics for a one-minute trace:
Capture tool   | Running time | Running regions
---------------+--------------+-----------------
cat            | 25.11 s      | 392,300
tracy_systrace | 10.41 s      | 12,249
2019-09-27 15:51:29 +02:00
Bartosz Taudul
3dba4088ee
Embed precompiled tracy_systrace for android.
2019-09-27 15:50:58 +02:00
Bartosz Taudul
e13cbf52fd
Allow changing tracy port in client.
2019-09-21 15:11:15 +02:00
Bartosz Taudul
a221f121ba
Extract lock state handling to a separate context class.
2019-09-21 14:55:14 +02:00
Bartosz Taudul
37661fd2ee
Fix 32 bit NEON version of DXT1 compression.
...
This reverts commit b32e8fa24e.
Apparently it is possible to receive non-uniform data in alpha channel, which
breaks the original assumption about not needing the mask. This seemed to be a
problem only on 32 bit NEON implementation of DXT1 compression. Other
implementations handle such data without degradation of visual output.
2019-09-03 21:37:07 +02:00
Bartosz Taudul
7a6564feae
Only recycle producers, if there's no data in queue.
...
("The queue" is per-thread partial queue here.)
This fixes a problem where one thread writes to the queue, then is
terminated, making the (partially filled) queue available for other
threads to recycle. If another thread re-owns the queue, it will change
the associated thread id, while part of the queue was filled by the
original thread. This obviously created invalid data during dequeue.
The fix makes the recycling process check not only for queue inactivity
(which is marked when the original thread terminates), but also if the
queue is empty, preventing mixing data from different threads.
2019-08-30 14:28:44 +02:00
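The recycling condition from 7a6564feae can be sketched as follows. The `Producer` structure and the `TryRecycle` function are hypothetical stand-ins, not the actual queue code; they only show the rule that a per-thread queue may be handed to another thread when it is both inactive and empty.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a per-thread partial queue.
struct Producer
{
    std::atomic<bool> inactive { false };   // set when the owning thread terminates
    std::atomic<size_t> pending { 0 };      // items enqueued but not yet dequeued
    uint64_t threadId = 0;
};

// Re-owns the producer only if its thread is gone AND no data is left in it.
inline bool TryRecycle( Producer& p, uint64_t newThreadId )
{
    // Leftover items still belong to the original thread, so refuse.
    if( p.pending.load( std::memory_order_acquire ) != 0 ) return false;

    // Claim the producer; fails if it is still active or already claimed.
    bool expected = true;
    if( !p.inactive.compare_exchange_strong( expected, false ) ) return false;

    p.threadId = newThreadId;
    return true;
}
```

Checking only `inactive` would reproduce the bug described above: the thread id would change while items from the old thread were still waiting to be dequeued.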
Bartosz Taudul
00b26c1acf
Fix TRACY_NO_SYSTEM_TRACING.
2019-08-26 18:02:10 +02:00
Bartosz Taudul
fbeee3cf61
Fix (?) invalid function pointer signature.
2019-08-26 17:59:58 +02:00
Bartosz Taudul
78127dc357
System threads only allow limited information queries.
2019-08-25 00:33:22 +02:00
Bartosz Taudul
deb59b4c38
Somehow fix event ordering.
2019-08-24 01:43:55 +02:00
Bartosz Taudul
1e74a89924
Check if there's data to read from kernel.
...
Reading from kernel pipe, while being a blocking operation, spin locks the
thread.
2019-08-24 01:06:21 +02:00
Bartosz Taudul
8f6e94d75c
Sleep if sys trace pipe buffer underruns.
2019-08-24 00:42:00 +02:00
Bartosz Taudul
2d50d07438
Allow completely disabling system tracing.
2019-08-21 01:16:25 +02:00
Bartosz Taudul
0cbb853945
Add missing SetThreadName() calls.
2019-08-20 16:23:00 +02:00
Bartosz Taudul
332262dd84
Shorter thread names.
2019-08-20 16:22:54 +02:00
Bartosz Taudul
247acd03ee
Kernel tracing on android.
2019-08-20 15:49:40 +02:00
Bartosz Taudul
e427d67347
Don't bail out if unimportant variables are not available.
2019-08-20 12:19:05 +02:00
Bartosz Taudul
bfda30be0b
Use su on android to set tracing variables.
2019-08-20 12:18:46 +02:00
Bartosz Taudul
9d87a8394d
Add missing getline() implementation for android API < 18.
2019-08-19 15:26:09 +02:00
Bartosz Taudul
9be6f4a414
Fix typo.
2019-08-19 13:03:37 +02:00
Bartosz Taudul
d209bb4d01
Add missing function pointer checks.
2019-08-19 12:47:27 +02:00
Bartosz Taudul
20e8a5ecc8
Create tid to pid mapping.
2019-08-17 22:32:41 +02:00
Bartosz Taudul
678e942e9f
Transfer PID of profiled program.
2019-08-17 22:19:04 +02:00
Bartosz Taudul
77c636c3fd
Retrieve module name for threads with no names on windows.
2019-08-17 21:24:40 +02:00
Bartosz Taudul
f7589bde02
Trace thread wakeups on linux.
2019-08-17 17:18:11 +02:00
Bartosz Taudul
414f903cc5
Collect thread wakeup data.
2019-08-17 17:05:29 +02:00
Bartosz Taudul
e9080bdbcd
Hardcode windows PID 4 as "System".
2019-08-17 03:44:47 +02:00
Bartosz Taudul
40eb8a5a03
Proper check for invalid handle.
2019-08-17 03:44:11 +02:00
Bartosz Taudul
6c1dd8eaec
Cast thread handle to DWORD.
2019-08-16 21:21:37 +02:00
Bartosz Taudul
d7104c752a
Cygwin compat layer.
2019-08-16 21:16:04 +02:00
Bartosz Taudul
819ef2a82b
External process/thread name retrieval on linux.
2019-08-16 21:00:42 +02:00
Bartosz Taudul
e975c4d7bf
Also retrieve external thread names.
2019-08-16 19:49:16 +02:00
Bartosz Taudul
fe7f56b022
Implement retrieval of external process names.
2019-08-16 19:22:23 +02:00
Bartosz Taudul
83fddd9aa6
Fix unicode builds.
2019-08-16 13:09:27 +02:00
Bartosz Taudul
9d5240c597
Mutable char array is required here due to shit API design.
2019-08-16 13:03:20 +02:00
Bartosz Taudul
14a373a3b8
Add number of CPU cores to host info.
2019-08-15 02:28:35 +02:00
Bartosz Taudul
69077e4e6f
Finish sending context switches during disconnect.
2019-08-14 23:06:13 +02:00
Bartosz Taudul
6dc79cf14e
Cosmetics.
2019-08-14 23:05:58 +02:00
Bartosz Taudul
c0b524d8de
Add a separate method for clearing serial queue.
2019-08-14 22:39:12 +02:00
Bartosz Taudul
71b54dd48a
Always collect thread names.
...
This fixes an issue when a thread was destroyed before its name could be
retrieved.
2019-08-14 16:52:04 +02:00
Bartosz Taudul
5e199d1ab3
Support ftrace on ARM.
2019-08-14 16:28:54 +02:00
Bartosz Taudul
5fbb811f5d
Degrade ARM timer to monotonic raw clock.
...
The monotonic raw clock has the same accuracy as reading cntvct registers, but
using clock_gettime() has a measurable impact on queueing time (135 us vs
83 us).
This change is needed to enable ftrace time readings on ARM linux, which
doesn't provide any way to get raw cntvct readings, like x86-tsc on x86.
2019-08-14 16:19:02 +02:00
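A minimal sketch of the clock read that 5fbb811f5d switches to, assuming a POSIX system. The `GetTimeNs` name is hypothetical, and the fallback to `CLOCK_MONOTONIC` where the raw clock is unavailable is an assumption of this sketch rather than something stated in the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Returns nanoseconds read from the raw monotonic clock.
inline int64_t GetTimeNs()
{
    timespec ts;
#ifdef CLOCK_MONOTONIC_RAW
    const int res = clock_gettime( CLOCK_MONOTONIC_RAW, &ts );
#else
    // Assumed fallback for platforms without the raw clock.
    const int res = clock_gettime( CLOCK_MONOTONIC, &ts );
#endif
    assert( res == 0 );
    (void)res;
    return int64_t( ts.tv_sec ) * 1000000000ll + ts.tv_nsec;
}
```

The point of the change is that kernel trace timestamps and profiler timestamps come from the same clock source, at the cost of a slower read than a direct counter register access.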
Bartosz Taudul
42865d7c7b
Don't set x86-tsc clock on non-x86 platforms.
2019-08-14 15:14:36 +02:00
Bartosz Taudul
54a9132bb5
Skip context switch events in on demand mode, if no connection.
2019-08-14 15:09:33 +02:00
Bartosz Taudul
602c38c6c0
Allow checking timer implementation.
2019-08-14 14:35:44 +02:00
Bartosz Taudul
3988b56c92
Capture context switches on linux.
2019-08-14 13:56:15 +02:00
Bartosz Taudul
92b6da7cc2
SetThreadName() only works on the current thread.
...
This breaking change is required, because kernel trace facilities use
kernel thread ids, which are inaccessible from the pthread_t level.
2019-08-14 02:22:45 +02:00
Bartosz Taudul
73cbf2eead
Use windows thread ids on cygwin.
2019-08-13 16:22:58 +02:00
Bartosz Taudul
b313e46139
Keep event trace properties to terminate trace on exit.
2019-08-13 13:10:37 +02:00
Bartosz Taudul
90d26cb1b6
Collect and send context switch events.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
fe0f1aea07
Add system tracing skeleton.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
8aa0be39d5
Drop support for CPU id queries.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
d6f32a0839
Serialize lock processing.
...
This makes it much easier to process on the server and opens new
optimization possibilities. It also fixes theoretical problems, which
may be caused by invalid ordering of events with the same timestamp.
2019-08-12 13:51:01 +02:00
Bartosz Taudul
0431c03556
Add serial queue interface.
2019-08-12 13:27:15 +02:00
Bartosz Taudul
4d2c7899ab
Allow skipping invariant TSC check.
2019-08-08 19:21:39 +02:00
Bartosz Taudul
3a221dafde
Display error messages on console, if available.
2019-08-08 19:18:05 +02:00
Bartosz Taudul
aada588129
Proper buffer reset.
2019-08-04 17:48:19 +02:00
Rokas Kupstys
b391e4c21a
Fix multiple build errors when compiling with MinGW.
2019-08-04 15:49:46 +03:00
Bartosz Taudul
12969ee497
Track thread context.
...
This change exploits the fact that events are processed in batches
originating from a single thread. A single message changing thread
context is enough to handle multiple messages, as opposed to inclusion
of thread identifier in each message.
2019-08-02 20:18:08 +02:00
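The thread context idea from 12969ee497 can be sketched like this. The `Event`, `Message` and `Serialize` names are hypothetical and the layout is not the real wire protocol; the sketch only shows one context message covering a whole batch of events from the same thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical queued event, tagged with the thread it came from.
struct Event { uint64_t thread; int payload; };

// Hypothetical wire message: either a thread-context change or a payload.
struct Message { bool isThreadContext; uint64_t thread; int payload; };

// Emits a thread-context message only when the source thread changes,
// instead of repeating the thread identifier in every message.
std::vector<Message> Serialize( const std::vector<Event>& events )
{
    std::vector<Message> out;
    bool haveContext = false;
    uint64_t current = 0;
    for( const Event& e : events )
    {
        if( !haveContext || e.thread != current )
        {
            out.push_back( { true, e.thread, 0 } );
            current = e.thread;
            haveContext = true;
        }
        out.push_back( { false, 0, e.payload } );
    }
    return out;
}
```

With batches that originate from a single thread, the per-message cost of the thread identifier drops to one extra message per batch.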
Bartosz Taudul
a4e7a341c0
Proper handling of disconnect request.
2019-08-01 23:14:09 +02:00
Bartosz Taudul
ca3571fd2b
Still more.
2019-07-30 01:30:31 +02:00
Bartosz Taudul
47423e6263
And more.
2019-07-30 01:29:13 +02:00
Bartosz Taudul
d3783ae359
Remove magic template syntax.
2019-07-30 01:28:21 +02:00
Bartosz Taudul
9c28b82954
RPMallocInit and RPMallocThreadInit are identical.
2019-07-30 01:15:14 +02:00
Bartosz Taudul
a6a3f45810
Fill in thread id during dequeue, not during enqueue.
2019-07-30 01:15:14 +02:00
Bartosz Taudul
142ef53b42
Dequeue items from a single thread.
2019-07-29 23:44:08 +02:00
Bartosz Taudul
c7f769c52b
Allow dequeuing from a single producer, retrieving thread id.
2019-07-29 23:29:30 +02:00
Bartosz Taudul
6cad76ae67
Store thread id in queue producer.
2019-07-29 23:13:06 +02:00
Bartosz Taudul
7ae9a28e32
Drop BlockingConcurrentQueue.
2019-07-29 22:58:13 +02:00
Bartosz Taudul
480a427e07
No need to hash thread ids anymore.
2019-07-29 22:36:04 +02:00
Bartosz Taudul
c60af95053
Remove unused const.
2019-07-29 22:33:32 +02:00
Bartosz Taudul
2d42abf552
Remove CannotAlloc functions.
2019-07-29 22:31:32 +02:00
Bartosz Taudul
b142860c8d
More implicit producer removal.
2019-07-29 22:29:39 +02:00
Bartosz Taudul
db6eceb1a6
Producers must be explicit.
2019-07-29 22:25:28 +02:00
Bartosz Taudul
89928fde7b
Queue must be always able to alloc.
2019-07-29 22:13:16 +02:00
Bartosz Taudul
a03734afa6
Remove more debug code.
2019-07-29 22:01:06 +02:00
Bartosz Taudul
e9a0145cd5
Remove MCDBGQ_NOLOCKFREE_IMPLICITPRODBLOCKINDEX.
2019-07-29 21:56:53 +02:00
Bartosz Taudul
b496f1ff90
Remove MOODYCAMEL_QUEUE_INTERNAL_DEBUG.
2019-07-29 21:52:49 +02:00
Bartosz Taudul
beaadc3a56
Remove always disabled MCDBGQ_TRACKMEM code.
2019-07-29 21:51:29 +02:00
Bartosz Taudul
82a4a6d9cc
Add tracy_ prefix to concurrentqueue.h file name.
2019-07-29 21:47:50 +02:00
Bartosz Taudul
276d764141
Fix cygwin.
2019-07-26 00:02:57 +02:00
Bartosz Taudul
36de7b2cc7
Fix incomplete headers.
2019-07-25 23:41:42 +02:00
Bartosz Taudul
e659220602
Use generic std::call_once() on other platforms.
2019-07-25 23:30:47 +02:00
Bartosz Taudul
d31d1f5946
Detect and report clang-cl.
2019-07-25 19:03:58 +02:00
Bartosz Taudul
092e830264
Use shifts instead of const vector and.
2019-07-22 19:56:47 +02:00
Bartosz Taudul
178dc9eba7
Combine block data directly in AVX registers.
2019-07-20 14:52:34 +02:00
Bartosz Taudul
a6300ef7d1
Ditto on ARM.
2019-07-19 22:13:56 +02:00
Bartosz Taudul
dc49f2f76a
Move DXT1 index conversion to server.
2019-07-19 21:46:58 +02:00
Bartosz Taudul
11ba77ced5
Use pthread_once() to initialize rpmalloc on linux.
2019-07-19 20:15:56 +02:00
Bartosz Taudul
4c28593031
Fix races in rpmalloc initialization.
...
Ensure rpmalloc_thread_initialize() in worker threads is called only after
rpmalloc_initialize() was called on the main profiler thread.
2019-07-19 19:25:27 +02:00
Bartosz Taudul
cef8124247
Replace or with addition to enable usra instruction.
2019-07-19 01:40:27 +02:00
Bartosz Taudul
fd4689a6e2
Don't perform unnecessary ands.
2019-07-19 01:19:52 +02:00
Bartosz Taudul
f65373ece7
Replace two packs with one shuffle.
2019-07-13 20:01:12 +02:00
Bartosz Taudul
fc83f97ad3
Same for AVX/SSE.
2019-07-13 19:34:08 +02:00
Bartosz Taudul
62a167541c
No need to mask out indices.
2019-07-13 19:07:25 +02:00
Alex
0c5ea710b0
Merged in z33ky/tracy/const-frame-image (pull request #37)
...
Constify frame-image pointer in API.
2019-07-13 13:09:21 +00:00
Bartosz Taudul
7bb9549e84
ARM64 specific NEON implementation of DXT1 compression.
2019-07-13 14:31:33 +02:00
Alexander 'z33ky' Hirsch
c6e8dc8d63
Constify frame-image pointer in API.
2019-07-13 12:33:55 +02:00
Bartosz Taudul
60d2384a6a
Allow sending application information messages.
2019-07-12 18:34:46 +02:00
Bartosz Taudul
a1ce5fc1f6
Add include for built-in __get_cpuid() on gcc/clang.
2019-07-10 02:09:19 +02:00
Bartosz Taudul
c164a70b9d
Check for rdtscp/invariant tsc support.
2019-07-10 02:04:14 +02:00
Bartosz Taudul
c0670848d2
Reuse variable.
2019-07-08 02:08:06 +02:00
Bartosz Taudul
17dbbe67de
Remove dependency on range subtraction.
2019-07-08 00:14:36 +02:00
Bartosz Taudul
af1bd3e1fa
Faster horizontal add.
2019-07-07 23:57:23 +02:00
Bartosz Taudul
b32e8fa24e
Ditto for NEON.
2019-07-06 00:18:53 +02:00
Bartosz Taudul
d236d4b70f
Ditto for AVX2.
2019-07-06 00:05:32 +02:00
Bartosz Taudul
f62b21c21d
Masking alpha out is not needed.
...
We assume that alpha value is constant for the whole image. The range
calculation is max - min, so alpha zeroes out. The color normalization
to range is color - min, so alpha also zeroes out here.
2019-07-05 23:58:19 +02:00
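The arithmetic behind f62b21c21d can be checked with a small sketch. `Pixel` and `BlockRange` are hypothetical names and the code is scalar rather than SIMD; it only demonstrates that the range (max - min) of an unmasked alpha channel is zero when alpha is constant, and non-zero otherwise.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Pixel = std::array<uint8_t, 4>;   // r, g, b, a

// Per-channel range (max - min) over a block, with alpha left unmasked.
Pixel BlockRange( const std::vector<Pixel>& block )
{
    assert( !block.empty() );
    Pixel lo = block[0], hi = block[0];
    for( const Pixel& p : block )
    {
        for( int c = 0; c < 4; c++ )
        {
            lo[c] = std::min( lo[c], p[c] );
            hi[c] = std::max( hi[c], p[c] );
        }
    }
    Pixel range;
    for( int c = 0; c < 4; c++ ) range[c] = uint8_t( hi[c] - lo[c] );
    return range;
}
```

The second case matters: if alpha varies within a block, the alpha lane no longer cancels out, which is the situation any code relying on the constant-alpha assumption has to rule out.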
Bartosz Taudul
03189a30b8
Two ands less in NEON DXT1 compression.
2019-07-05 18:37:25 +02:00
Bartosz Taudul
275d992cb1
Two ands less in AVX2 DXT1 compression.
2019-07-05 18:22:42 +02:00
Bartosz Taudul
c89358d6b9
Two ands less in SSE DXT1 compression.
2019-07-05 18:17:50 +02:00
Bartosz Taudul
5bfc62f1bf
iOS device name decoding.
2019-06-19 09:59:46 +02:00
Bartosz Taudul
59b4f84ce5
Display unknown implementer, part as hex values.
2019-07-03 21:18:17 +02:00
Bartosz Taudul
c6f6c368b2
Decode ARM CPU names.
2019-07-03 21:01:34 +02:00
Bartosz Taudul
e26ab8e9f6
Make forwarding functions more compact.
2019-07-03 18:05:38 +02:00
Bartosz Taudul
bdfb568742
Fix div tables for max range on all channels.
2019-07-01 12:31:06 +02:00
Bartosz Taudul
684a119a2c
Fix order of checks for including intrinsics.
2019-07-01 11:45:16 +02:00
Bartosz Taudul
983c48994b
Write block data directly to memory.
2019-06-30 11:44:32 +02:00
Bartosz Taudul
9b8c18f99e
Improve readability.
2019-06-30 11:44:00 +02:00
Bartosz Taudul
52b6bdb55a
Force inline ProcessRGB functions.
2019-06-30 03:33:14 +02:00
Bartosz Taudul
8c06f7288c
AVX2 DXT1 compression.
2019-06-30 03:20:58 +02:00
Bartosz Taudul
2e893bba91
Use division tables.
2019-06-29 12:16:49 +02:00
Bartosz Taudul
ab9f036f5e
Integrate CheckSolid into ProcessRGB.
2019-06-29 02:04:08 +02:00
Bartosz Taudul
faf6bb97a4
DXT1 NEON color index packing.
2019-06-28 22:36:44 +02:00
Bartosz Taudul
2df1eaaa7e
Pack color indices using SSE.
2019-06-28 21:58:10 +02:00
Bartosz Taudul
fcb5b4b888
NEON DXT1 compression.
2019-06-28 14:24:16 +02:00
Bartosz Taudul
e8d4ba492b
Unify shifts.
2019-06-28 13:05:32 +02:00
Bartosz Taudul
be4900c822
NEON CheckSolid.
2019-06-28 01:47:04 +02:00
Bartosz Taudul
3c066f1527
Simplify code.
2019-06-27 22:40:03 +02:00
Bartosz Taudul
72a0d4c2ab
Rest of SSE DXTC compression.
2019-06-27 22:29:44 +02:00
Bartosz Taudul
137b28e110
SSE CheckSolid.
2019-06-27 22:29:44 +02:00
Bartosz Taudul
3d590b6b8c
Initialize rpmalloc in compression thread.
2019-06-27 19:14:51 +02:00
Bartosz Taudul
1939c31165
Experimental DXT1 compressor.
2019-06-27 19:14:51 +02:00
Bartosz Taudul
79eb1b9029
Swap queue and dequeue only if queue has contents.
2019-06-27 13:37:09 +02:00
Bartosz Taudul
bb35f9a897
Compress frame images in a separate thread.
2019-06-27 13:24:35 +02:00
Bartosz Taudul
7ebd2162c6
Add ETC1 compression thread.
2019-06-26 22:57:24 +02:00
Bartosz Taudul
f565e11976
Store frame images in queue.
2019-06-26 22:52:24 +02:00
Bartosz Taudul
281dcf7c1f
Cast to proper types.
2019-06-26 19:33:37 +02:00
Bartosz Taudul
8ce41b3543
Proper init order of thread local thread handle.
2019-06-26 19:32:52 +02:00
Bartosz Taudul
bc7f2c49c8
GetThreadHandle() might be used by application's code.
2019-06-25 15:44:49 +02:00
Bartosz Taudul
c749a2e3fe
Add C API for plots and messages.
2019-06-24 21:03:39 +02:00
Bartosz Taudul
48e08acb62
Add C API for frame markup.
2019-06-24 21:03:39 +02:00
Bartosz Taudul
ee99ce833c
Implement memory allocation tracking for C API.
2019-06-24 21:03:39 +02:00
Bartosz Taudul
281477f7f9
Tokens must be retrieved for each enqueue.
2019-06-24 20:12:14 +02:00
Bartosz Taudul
06a41708a7
Move TLS accesses close together.
2019-06-24 19:38:44 +02:00
Bartosz Taudul
c4f0965851
Don't use cached thread id to retrieve main thread id.
2019-06-24 19:38:07 +02:00
Bartosz Taudul
a56c47a6a0
Store thread handle in a thread local variable.
...
This saves us a non-inlineable function call. Thread local block is
accessed anyway, since we need to get the token, so we already have the
pointer and don't need to get it a second time (which is done inside
Windows' GetCurrentThreadId()). We also don't need to store the thread
id in ScopedZone anymore, as it was a micro-optimization to save us the
second GetThreadHandle() call.
This change has a measurable effect of reducing enqueue time from ~10 to
~8 ns.
A further optimization would be to completely skip thread handle
retrieval during zone capture and do it instead on retrieval of data
from the queue. Since each thread has its own producer ("token"), the
thread handle should be accessible during the dequeue operation. This is
a much more invasive change, that would require a) modification of the
queue, b) additional processing of dequeued data to inject the thread
handle.
2019-06-24 19:19:47 +02:00
Bartosz Taudul
fd9fc880a6
Send current time in on-demand welcome message.
2019-06-21 19:39:41 +02:00
Bartosz Taudul
5309e6d94a
Broadcast client activity time.
2019-06-18 20:46:12 +02:00
Bartosz Taudul
aa5259b20a
Use the same port (8086) for both TCP and UDP traffic.
2019-06-18 20:28:03 +02:00
Bartosz Taudul
0e5a7263d9
Define broadcast message, add versioning.
2019-06-18 20:26:40 +02:00
Bartosz Taudul
0b394c3f53
Don't need to keep last broadcast time in Profiler class.
2019-06-18 20:15:09 +02:00
Bartosz Taudul
11dc8e67e5
Change broadcast rate from 5s to 3s.
2019-06-17 19:57:17 +02:00
Bartosz Taudul
6bf8081f5b
Remove debug leftovers.
2019-06-17 19:52:44 +02:00
Bartosz Taudul
de058d2a0d
Don't hardcode broadcast port.
2019-06-17 18:37:34 +02:00
Bartosz Taudul
1b3b3a94a2
Broadcast protocol version and process name.
2019-06-17 18:34:35 +02:00
Bartosz Taudul
0b9ef7e514
Disable broadcast if TRACY_NO_BROADCAST is defined.
2019-06-17 18:18:58 +02:00
Bartosz Taudul
e609c0fdce
UDP broadcast loop.
2019-06-17 02:25:09 +02:00
Bartosz Taudul
014c3ed63b
Use non-reference, optimized NEON ETC1 compression.
2019-06-15 15:35:57 +02:00
Bartosz Taudul
ab4e99229d
Indicate whether client is running on apple shitware.
2019-06-13 14:05:15 +02:00
Bartosz Taudul
e5d5abf59a
Add NEON path for ETC1 compression.
2019-06-13 02:04:19 +02:00
Bartosz Taudul
d3e0163dd4
Add byteswap for apple.
2019-06-12 16:54:44 +02:00
Bartosz Taudul
37d1457b44
Frame image may need flipping.
2019-06-12 15:28:32 +02:00
Bartosz Taudul
04dd33f5c4
Fix mismatched linkage.
2019-06-11 23:51:12 +02:00
Rokas K. (rku)
c4e05b6264
Merged in rokups/tracy/dllimport-cleanup (pull request #36)
...
Clean up imported functions in multi-dll projects.
Approved-by: Till Rathmann <till.rathmann@gmx.de>
2019-06-11 15:04:34 +00:00
Bartosz Taudul
57b8b425ba
Discard send buffer data after disconnect.
2019-06-10 02:11:29 +02:00
Bartosz Taudul
80dff1ede1
Add connection id for on-demand mode.
...
Long-lived zones could send their end events without begin events in the
following scenario:
1. On-demand connection is made.
2. Zone begin is emitted, m_active is set to true.
3. Connection is terminated.
4. A new connection is made.
5. Zone end is emitted, because m_active is true.
To this point it was assumed that all zone end events will happen before
a new connection is made, but it's not necessarily true.
2019-06-09 17:15:47 +02:00
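The scenario from 80dff1ede1 and its fix can be sketched as follows. `Connection`, `Zone` and `ShouldSendEnd` are hypothetical names, not the profiler's actual classes; the sketch only shows a zone remembering the connection id it started on and refusing to emit its end event on a different connection.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the on-demand connection state.
struct Connection
{
    std::atomic<bool> active { false };
    std::atomic<uint64_t> id { 0 };

    void Connect() { id.fetch_add( 1 ); active.store( true ); }
    void Disconnect() { active.store( false ); }
};

// A zone records which connection its begin event was emitted on.
struct Zone
{
    Connection& conn;
    bool began;
    uint64_t connId;

    explicit Zone( Connection& c )
        : conn( c ), began( c.active.load() ), connId( c.id.load() ) {}

    // The end event is valid only if the begin event went out on the
    // connection that is still the current one.
    bool ShouldSendEnd() const
    {
        return began && conn.active.load() && conn.id.load() == connId;
    }
};
```

Checking the active flag alone is not enough, because after a reconnect the flag is true again although the begin event belonged to the previous connection.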
Bartosz Taudul
0db9c73d76
Immediately react to connection termination.
2019-06-09 16:51:39 +02:00
Bartosz Taudul
cc5bad294a
More strict memory ordering for on-demand connection status.
2019-06-09 16:48:00 +02:00
Bartosz Taudul
e2d42fae2f
We're done here, don't try to send termination request.
2019-06-09 16:25:52 +02:00
Bartosz Taudul
496f866add
Don't send data when connection is terminated.
...
There are only two cases for which HandleServerQuery() returns false.
Either data can't be read from the socket (which is checked by HasData()
call before calling HandleServerQuery()), or if the server sent
termination query. In both these cases there's no need to send data
anymore.
2019-06-09 16:19:40 +02:00
Bartosz Taudul
23e7850162
Make DequeueStatus enum class.
2019-06-09 16:14:30 +02:00
Bartosz Taudul
34d89d39a1
Prevent double freeing of socket.
2019-06-09 16:10:49 +02:00
Bartosz Taudul
139299389b
Add comments to client connection handling.
2019-06-09 16:10:49 +02:00
Bartosz Taudul
4c2ff80ac8
Restore frame counting for on-demand mode.
2019-06-09 15:23:01 +02:00
Bartosz Taudul
00a468162d
Fix signed/unsigned comparison.
2019-06-08 00:57:25 +02:00
Bartosz Taudul
9ef128995a
Add AVX2 version of etcpak.
2019-06-08 00:50:39 +02:00
Bartosz Taudul
7e9539ef2d
AVX implies SSE 4.1.
2019-06-08 00:39:19 +02:00
Bartosz Taudul
784c4da53a
Include frame offset in frame image message.
2019-06-07 20:09:29 +02:00
Rokas Kupstys
9bd1037347
Clean up imported functions in multi-dll projects.
2019-06-07 19:50:08 +03:00
Bartosz Taudul
d271634a95
Keep one ETC1 compression buffer.
2019-06-07 01:29:24 +02:00
Bartosz Taudul
34a6fe7055
_bswap may be already defined.
2019-06-07 01:07:51 +02:00
Bartosz Taudul
a654b642ef
Compress frame images to ETC1 before sending.
2019-06-07 00:31:51 +02:00
Bartosz Taudul
aff3246f82
Add ETC1 compressor.
2019-06-07 00:31:51 +02:00
Bartosz Taudul
e5bb6011c5
Frame image transfer prototype.
2019-06-06 21:39:54 +02:00
Bartosz Taudul
b3812146cb
Fix atomics initialization.
2019-05-27 14:09:55 +02:00
Bartosz Taudul
340837e202
Callstack decode for android api <= 21.
...
libbacktrace/elf.cpp:3249:3: error: use of undeclared identifier 'dl_iterate_phdr'
2019-05-22 14:14:30 +02:00
Bartosz Taudul
84efe070fe
Make callstack logic more obvious.
2019-05-22 14:05:44 +02:00
Bartosz Taudul
efc54babe3
Transfer of colored messages.
2019-05-10 20:17:44 +02:00
Bartosz Taudul
9ec8704dad
Don't include LZ4 headers in tracy headers.
...
The LZ4 implementation is wrapped in tracy namespace, but it also adds
some defines, which may conflict with other LZ4 implementations.
2019-05-01 12:57:42 +02:00
Bartosz Taudul
2c9d9d0d27
/proc/stat might be inaccessible.
2019-04-04 15:25:26 +02:00
Bartosz Taudul
302ad87686
Fix typo.
2019-03-21 22:06:37 +01:00
Bartosz Taudul
94ed1c637c
Try to check if cntvct reads are monotonic.
...
https://lore.kernel.org/patchwork/patch/904607/
2019-03-21 21:59:51 +01:00
Bartosz Taudul
7f57b3dba9
Fallback to reading CLOCK_MONOTONIC_RAW, if available.
2019-03-21 21:49:23 +01:00
Bartosz Taudul
17fb589415
Try dladdr() resolution if libbacktrace fails.
2019-03-05 20:43:47 +01:00
Bartosz Taudul
49f1277e55
Cast void* to char*.
2019-03-05 20:20:55 +01:00
Bartosz Taudul
afe2fad1a7
Send native callstack before allocated one.
2019-03-05 19:18:43 +01:00
Bartosz Taudul
4509412efb
Fast callstack retrieval for linux.
2019-03-05 18:56:39 +01:00
Bartosz Taudul
1bbf296351
Use fast callstack frame decoding to cut callstack.
2019-03-05 02:42:51 +01:00
Bartosz Taudul
cb62b63fe2
Fast callstack frame decoder.
...
Returns only function name, doesn't retrieve inlined functions, doesn't
perform demangling.
2019-03-05 02:42:51 +01:00
Bartosz Taudul
b11f932078
Cut lua callstack at lua_pcall.
2019-03-05 02:42:51 +01:00
Bartosz Taudul
ec73178733
Move callstack cutting to a separate function.
2019-03-05 02:42:51 +01:00
Bartosz Taudul
d229c1bc1b
Send native callstack along with allocated callstack.
2019-03-05 02:42:50 +01:00
Bartosz Taudul
bef31ba073
Separate message for zone begin with alloc src loc and callstack.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
e3c31e4a4e
Send callstack alloc payload.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
d863245b49
Serialize discontinuous frame messages.
2019-02-28 19:21:23 +01:00
Bartosz Taudul
b89db6e926
Don't send CPU usage data when there are no readings.
2019-02-25 15:11:35 +01:00
Bartosz Taudul
963d2b3ca8
CPU usage getter for apple.
2019-02-25 15:04:06 +01:00
Bartosz Taudul
85f29a0f22
Collect system time before server connection is made.
2019-02-24 19:12:17 +01:00
Bartosz Taudul
bafc8a1330
Implement getting CPU usage in linux.
2019-02-24 19:02:49 +01:00
Bartosz Taudul
0b9fa8f3c8
Track CPU usage also on cygwin.
2019-02-21 23:11:09 +01:00
Bartosz Taudul
9f4f5bcb63
CPU usage retrieval.
2019-02-21 22:45:53 +01:00
Bartosz Taudul
938d8ce69e
Properly initialize demangled pointer.
2019-02-21 15:04:17 +01:00
Bartosz Taudul
44009b6fda
Use mach_absolute_time() to get time on iOS.
2019-02-21 14:45:13 +01:00
Bartosz Taudul
e839a3153f
Just use getprogname().
2019-02-21 11:40:56 +01:00
Bartosz Taudul
c4d46f1c24
No libproc.h on iOS.
2019-02-21 11:33:45 +01:00
Till Rathmann
9d7c4a2861
Merged in tillrathmann/tracy (pull request #33)
...
Fixed DLL support
2019-02-20 17:24:12 +00:00
Till Rathmann
29140afe0c
Fixed compiler warnings.
2019-02-20 17:50:49 +01:00
Till Rathmann
77abc3bffd
Fixed DLL support.
2019-02-20 16:15:13 +01:00
Bartosz Taudul
22329ae5d9
Collect call stacks on apple.
2019-02-20 16:01:41 +01:00
Bartosz Taudul
34d24b16bb
Retrieve memory size on apple.
2019-02-20 13:52:55 +01:00
Bartosz Taudul
9c966b6224
Process name retrieval on apple.
2019-02-20 13:13:29 +01:00
Bartosz Taudul
8f75839d66
Fix apple target detection.
2019-02-20 12:43:48 +01:00
Bartosz Taudul
5afadcb11d
Fix if condition.
2019-02-19 21:51:41 +01:00
Bartosz Taudul
ef5e30056e
Implement delayed initialization of the profiler.
...
Enabled on osx, ios.
2019-02-19 20:43:30 +01:00
Bartosz Taudul
3f914834b7
Hide rest of statics.
2019-02-19 19:33:37 +01:00
Bartosz Taudul
9fabafbeca
Fix DLL code.
2019-02-19 18:46:59 +01:00
Bartosz Taudul
2421e05c27
Prevent direct access to s_profiler.
2019-02-19 18:38:08 +01:00
Bartosz Taudul
d865d1cc87
Disallow direct access to s_token.
2019-02-19 18:27:00 +01:00
Bartosz Taudul
44753dd4ac
thread_local implies static.
2019-02-19 16:52:05 +01:00
Bartosz Taudul
c7e64bb8a8
Replace select() with poll().
2019-02-10 15:45:23 +01:00
Bartosz Taudul
9dd869a5eb
Fix call stacks on cygwin.
2019-02-02 13:58:17 +01:00
Bartosz Taudul
653caf159f
Assign return value only once.
2019-01-29 22:21:01 +01:00
Bartosz Taudul
a708bebbfd
Use language neutral header for callstack capability detection.
...
This fixes call stack collection in C API when TRACY_CALLSTACK is
defined.
2019-01-27 13:41:32 +01:00
Bartosz Taudul
01bddf95a6
Trace inline function calls on MSVC call stacks.
2019-01-26 23:50:58 +01:00
Bartosz Taudul
49b0a3500d
Enable tracing inline functions in callstacks.
2019-01-20 19:33:37 +01:00
Bartosz Taudul
ddad475c19
Make it possible to store multiple frames at single frame address.
2019-01-20 19:11:48 +01:00
Bartosz Taudul
bf7cc0a0d5
Add missing header for PRIxMAX.
2019-01-20 17:17:09 +01:00
Bartosz Taudul
9e7714c45a
Decode callstack frames using libbacktrace.
2019-01-20 16:55:59 +01:00
Rokas Kupstys
36c76456f7
Fix mistakes from MinGW support commit.
2019-01-19 15:03:43 +02:00
Rokas Kupstys
8157e3a0b3
Fix builds with MinGW.
2019-01-19 13:53:10 +02:00
Bartosz Taudul
92f3a4bba0
Add ZoneText and ZoneName to the C API.
2019-01-16 02:10:21 +01:00
Bartosz Taudul
b72d30af80
Allow disabling zone verification.
2019-01-15 18:59:05 +01:00
Bartosz Taudul
8e52ab318b
Send zone validation messages.
...
This is only performed for C API, as C++ scoped zones are always
properly ordered, due to RAII. With manual submission of zone begin and
end events there's no such guarantee.
2019-01-14 22:36:54 +01:00
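One way the ordering check motivated by 8e52ab318b could look is sketched below. `ZoneValidator` is a hypothetical class and the stack-based check is an assumption; the log only states that validation messages are sent and that zone ids are tracked.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical checker for manually submitted zone begin/end events.
class ZoneValidator
{
public:
    void Begin( uint32_t id ) { m_stack.push_back( id ); }

    // Returns false if the zone being closed is not the innermost open one.
    bool End( uint32_t id )
    {
        if( m_stack.empty() || m_stack.back() != id ) return false;
        m_stack.pop_back();
        return true;
    }

private:
    std::vector<uint32_t> m_stack;   // ids of currently open zones, innermost last
};
```

C++ scoped zones close in reverse order of construction automatically, so a check of this kind is only needed when begin and end are submitted by hand.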
Bartosz Taudul
970108fbbf
Track zone id for verification purposes.
2019-01-14 22:36:54 +01:00
Bartosz Taudul
1a8518dcc2
Allow filtering zones in on-demand mode.
2019-01-14 22:36:54 +01:00
Bartosz Taudul
1f0d1fdfdc
C API prototype.
2019-01-14 21:07:29 +01:00
Bartosz Taudul
070888f80d
Make it possible to have multiple vulkan contexts.
...
API change!
2019-01-10 17:11:17 +01:00
Bartosz Taudul
b1ba2f9bf7
Fix extern "C" initialization.
2018-12-29 01:00:14 +01:00
Bartosz Taudul
1733961885
Proper printf type for DWORDLONG on cygwin.
2018-12-29 01:00:14 +01:00
Bartosz Taudul
ee718f18d9
Cygwin headers provide their own FORCEINLINE macro.
2018-12-29 01:00:14 +01:00
Bartosz Taudul
0a6c6606bf
Don't use MSVC pragmas on gcc/clang (cygwin).
2018-12-29 01:00:14 +01:00
Miguel Fernandez
baa870fa8c
Moved NoMinMax before windows.h
2018-12-24 18:50:52 +00:00
Miguel Fernandez
7c164375a4
Moved NoMinMax inside _MSC_VER
2018-12-24 18:49:53 +00:00
Miguel Fernandez
51bdb004f9
Avoid conflicts with min/max macros
2018-12-24 15:26:50 +00:00
Bartosz Taudul
e9ce8fdfda
Flush queues when opening listen socket fails.
2018-12-21 18:14:30 +01:00
Bartosz Taudul
a4be9b51b0
Use common queue clearing function.
2018-12-21 18:12:26 +01:00
Bartosz Taudul
331693d7f1
Use proper pattern for acquiring serial lock.
...
This fixes a potential hang during crash handling. Also, lock duration
is reduced.
2018-12-21 18:11:09 +01:00
Rokas Kupstys
a931b9eaf1
HOST_NAME_MAX and LOGIN_NAME_MAX availability is not consistent across linux/android/macos platforms. However all of them do have versions of these macros with _POSIX_ prefix.
...
In addition, the hostname and user variables may be uninitialized in some configurations, yet they are always used. Initializing these arrays fixes the "conditional depending on uninitialized memory" warning reported by valgrind.
2018-12-18 17:19:03 +02:00
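A minimal sketch of the two points made in the commit above, assuming a POSIX system: the buffer is sized from the `_POSIX_`-prefixed macro and zero-initialized before use. The names `kHostNameMax` and `QueryHostName`, and the fallback value of 255, are assumptions made for this example and are not taken from the Tracy sources.

```cpp
#include <climits>
#include <cstddef>
#include <string>
#include <unistd.h>

// Hypothetical constant: prefer the _POSIX_-prefixed macro, which the commit
// message says is available on Linux, Android and macOS alike.
#if defined _POSIX_HOST_NAME_MAX
constexpr size_t kHostNameMax = _POSIX_HOST_NAME_MAX;
#else
constexpr size_t kHostNameMax = 255;  // assumed fallback for this sketch
#endif

// Hypothetical helper: returns the host name, or an empty string on failure.
inline std::string QueryHostName()
{
    // Zero-initialized, so the buffer is always a valid C string even if
    // gethostname() fails or truncates without writing a terminator.
    char hostname[kHostNameMax + 1] = {};
    if (gethostname(hostname, kHostNameMax) != 0) hostname[0] = '\0';
    return std::string(hostname);
}
```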
Bartosz Taudul
083320820f
OSX doesn't define HOST_NAME_MAX and LOGIN_NAME_MAX.
...
Fix based on patch from Jack Skalski.
2018-12-17 15:11:59 +01:00
Bartosz Taudul
a7e615d42e
Cosmetics.
2018-12-17 15:09:10 +01:00
Bartosz Taudul
0b816ce0b7
Add lock termination event.
2018-12-16 20:46:33 +01:00
Bartosz Taudul
61ac0b8afc
Send lock creation time.
2018-12-16 20:33:18 +01:00
Bartosz Taudul
f19b559f6e
InitOnceExecuteOnce requires targeting Windows Vista.
...
Cygwin fix.
2018-11-25 19:03:17 +01:00
Sherief Farouk
591f04ad0f
Renamed preprocessor #define for consistency.
2018-10-28 22:41:08 -07:00
Sherief Farouk
5110d55f17
Fix for using Tracy with multithreaded NT loader in Windows 10 RS5 (Issue #26) [Take 2].
2018-10-28 18:55:55 -07:00
Sherief Farouk
27447902ef
Fix for using Tracy with multithreaded NT loader in Windows 10 RS5 (Issue #26).
2018-10-27 18:13:59 -07:00
Bartosz Taudul
6be66d7a3c
Fix on-demand mode.
2018-09-09 19:44:41 +02:00
Bartosz Taudul
9211ce42da
Non-on-demand client is only able to handle one connection.
2018-09-09 19:42:06 +02:00
Bartosz Taudul
984a711666
Send protocol version to verify handshake.
2018-09-09 19:28:53 +02:00
Bartosz Taudul
db1d7d2c92
Free socket after disconnection.
2018-09-09 18:31:06 +02:00
Bartosz Taudul
270072b09e
Require shibboleth match at start of connection.
2018-09-09 18:26:53 +02:00
Bartosz Taudul
00da3ba6eb
SEGV_{BND,PKU}ERR might not be defined.
2018-08-27 14:45:07 +02:00
Bartosz Taudul
2ebe9b72d1
There's no getlogin_r() on android.
2018-08-27 13:59:19 +02:00
Bartosz Taudul
a1a9f6d610
Fix printf types.
2018-08-22 16:31:09 +02:00
Bartosz Taudul
8a78fcd2f9
Cut off Linux stack trace at sigreturn.
2018-08-21 01:53:00 +02:00
Bartosz Taudul
22346feea3
Fun fact: two threads can crash at the same time.
2018-08-21 01:45:33 +02:00
Bartosz Taudul
47943d6a86
Use proper type.
2018-08-21 01:24:00 +02:00
Bartosz Taudul
facb05f8cb
Don't mark FastVector element as used until it's ready.
...
This should prevent a race condition that would result in an invalid last
element of the queue, in case a frozen thread already got the queue
item, but didn't write to it (or didn't write it fully).
2018-08-20 22:35:50 +02:00
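The idea in the commit above, handing out a slot first and counting it as used only after it has been written, might be sketched as follows. `TinyVector` is a hypothetical fixed-capacity stand-in written for this example; it is not the FastVector class from the Tracy sources, and its method names are assumptions.

```cpp
#include <cstddef>

// Hypothetical, simplified container: fixed capacity, no reallocation.
template <typename T, size_t N>
class TinyVector
{
public:
    // Returns the next free slot without counting it as used, or nullptr
    // if the container is full.
    T* prepare_next() { return m_size < N ? &m_data[m_size] : nullptr; }

    // Publishes the slot returned by prepare_next(). Call this only after
    // the slot has been fully written.
    void commit_next()
    {
        if (m_size < N) ++m_size;
    }

    size_t size() const { return m_size; }
    const T& operator[](size_t i) const { return m_data[i]; }

private:
    T m_data[N] = {};
    size_t m_size = 0;
};
```

If a writer is stopped between `prepare_next()` and `commit_next()`, a reader that trusts `size()` never sees the half-written element.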
Bartosz Taudul
8c0ff67796
Cut windows crash call stack at the exception dispatcher.
2018-08-20 22:21:35 +02:00
Bartosz Taudul
d1adf9e8d6
Allow skipping functions on top of call stack.
...
Note that this is performance-intensive on the client and shouldn't be used,
except in special situations, like processing crashes.
2018-08-20 22:20:44 +02:00
Bartosz Taudul
b371003336
In case of manual shutdown, don't wait for lock.
...
All threads are frozen at this point, so nothing will release it.
2018-08-20 21:49:23 +02:00
Bartosz Taudul
401ebd6f3d
Use spin-lock in DequeueSerial.
...
A thread frozen during crash processing may hold the lock and never
release it. The old behavior would cause a deadlock in such a situation. The
new one can be modified to work. Also, we don't want to use a timed mutex.
2018-08-20 21:40:13 +02:00
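One way a spin-lock "can be modified to work" when its holder may never release it is to let the waiting loop check a bail-out condition. The sketch below shows that pattern under stated assumptions: `SpinLock`, `lock_unless` and the `giveUp` flag are hypothetical names, and the actual DequeueSerial code may be structured differently.

```cpp
#include <atomic>

// Hypothetical spin-lock, for illustration only.
class SpinLock
{
public:
    // Attempts to take the lock once; returns true on success.
    bool try_lock()
    {
        return !m_locked.exchange(true, std::memory_order_acquire);
    }

    void unlock() { m_locked.store(false, std::memory_order_release); }

    // Spins until the lock is acquired or `giveUp` becomes true.
    // Returns true if the lock was acquired, false if the wait was abandoned.
    bool lock_unless(const std::atomic<bool>& giveUp)
    {
        while (!try_lock())
        {
            if (giveUp.load(std::memory_order_relaxed)) return false;
        }
        return true;
    }

private:
    std::atomic<bool> m_locked{false};
};
```

A blocking mutex offers no such exit short of a timed wait, which the commit message explicitly wants to avoid.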
Bartosz Taudul
6d45434cb5
Implement crash handler on Linux.
2018-08-20 14:30:56 +02:00
Bartosz Taudul
53aee0e03d
Fix warning.
2018-08-20 12:53:14 +02:00
Bartosz Taudul
3b526b074e
Send crash report.
2018-08-20 02:23:55 +02:00
Bartosz Taudul
49e36c013f
Only handle selected subset of exceptions.
2018-08-20 02:06:59 +02:00
Bartosz Taudul
0258f4a7b4
Handle crashes on windows.
...
When a crash happens, put all threads (bar the profiler and crash
handling ones) into the freezer, send a crash notification message,
request profiler shutdown and, once it completes, terminate the process.
The list of ignored exceptions is sorta-kinda random at the moment and
may need further expansion.
2018-08-20 01:07:33 +02:00
Bartosz Taudul
ca939ccd19
Allow external profiler shutdown requests.
2018-08-20 01:02:27 +02:00
Bartosz Taudul
d63b5431bf
Discover linux kernel version.
2018-08-19 19:00:01 +02:00
Bartosz Taudul
f55b99ba7e
Fix signed/unsigned.
2018-08-19 18:53:32 +02:00
Bartosz Taudul
e9170c862e
System RAM discovery on Linux.
2018-08-19 18:52:04 +02:00
Bartosz Taudul
790a3ae26f
Perform windows version discovery.
2018-08-19 18:43:26 +02:00
Bartosz Taudul
bd76f4cd10
Send host info in welcome message.
2018-08-19 18:19:12 +02:00
Bartosz Taudul
9c0e6620b3
Host info discovery.
2018-08-19 18:15:46 +02:00
Arvid Gerstmann
076e83635b
Add possibility to explicitly avoid logging
2018-08-13 14:47:52 +02:00
Bartosz Taudul
9d051cf5ee
Add support for discontinuous frames.
2018-08-05 02:15:54 +02:00
Bartosz Taudul
9b4348b497
Handle frame name queries.
2018-08-04 21:10:45 +02:00
Bartosz Taudul
adde6cf4fd
Allow sending named frames.
2018-08-04 15:04:18 +02:00
Bartosz Taudul
922882d3b0
Add name field to frame mark message.
2018-08-04 15:03:47 +02:00
Till Rathmann
c71d99c134
Minor change: converted spaces to tabs on the just-inserted line, since tracy_rpmalloc.cpp uses tabs for indentation.
2018-08-02 11:53:04 +02:00
Till Rathmann
4968717313
Fixed compiler warning about unused variable in release builds.
2018-08-02 11:45:15 +02:00
Till Rathmann
3b302315f9
Fixed __ANDROID_API__ < 21 build and FD_SET usage.
2018-08-01 19:18:40 +02:00
Till Rathmann
37d5736bf5
Fixed compiler warnings.
2018-08-01 14:07:30 +02:00
Till Rathmann
2dcfe5fce0
Made s_threadNameDataInstance and s_profilerInstance static.
2018-07-31 13:03:09 +02:00
Till Rathmann
dd042619e9
Support for multi-DLL projects.
2018-07-31 12:06:04 +02:00
Bartosz Taudul
31c2ddb8ac
Rename client's SourceLocation to SourceLocationData.
2018-07-28 00:34:04 +02:00
Bartosz Taudul
3737e122cf
Of course, this can't work without stupid fuckery.
2018-07-26 19:59:55 +02:00
Arvid Gerstmann
b8db9df949
Detect glibc explicitly
2018-07-14 13:23:00 +02:00
Arvid Gerstmann
ad48c32e1e
Support for callstacks on Linux without glibc
2018-07-14 11:08:17 +02:00
Bartosz Taudul
561d2dc360
Use the fastest mutex available.
...
The selection is based on the following test results:
MSVC:
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 11.641 ns/iter
2 thread contention: 141.559 ns/iter
3 thread contention: 242.733 ns/iter
4 thread contention: 409.807 ns/iter
5 thread contention: 561.544 ns/iter
6 thread contention: 785.845 ns/iter
=> std::mutex
No contention: 19.190 ns/iter
2 thread contention: 39.305 ns/iter
3 thread contention: 58.999 ns/iter
4 thread contention: 59.532 ns/iter
5 thread contention: 103.539 ns/iter
6 thread contention: 110.314 ns/iter
=> std::shared_timed_mutex
No contention: 45.487 ns/iter
2 thread contention: 96.351 ns/iter
3 thread contention: 142.871 ns/iter
4 thread contention: 184.999 ns/iter
5 thread contention: 336.608 ns/iter
6 thread contention: 542.551 ns/iter
=> std::shared_mutex
No contention: 10.861 ns/iter
2 thread contention: 17.495 ns/iter
3 thread contention: 31.126 ns/iter
4 thread contention: 40.468 ns/iter
5 thread contention: 15.677 ns/iter
6 thread contention: 64.505 ns/iter
Cygwin (clang):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 11.536 ns/iter
2 thread contention: 121.082 ns/iter
3 thread contention: 396.430 ns/iter
4 thread contention: 672.555 ns/iter
5 thread contention: 1327.761 ns/iter
6 thread contention: 14151.955 ns/iter
=> std::mutex
No contention: 62.583 ns/iter
2 thread contention: 3990.464 ns/iter
3 thread contention: 7161.189 ns/iter
4 thread contention: 9870.820 ns/iter
5 thread contention: 12355.178 ns/iter
6 thread contention: 14694.903 ns/iter
=> std::shared_timed_mutex
No contention: 91.687 ns/iter
2 thread contention: 1115.037 ns/iter
3 thread contention: 4183.792 ns/iter
4 thread contention: 15283.491 ns/iter
5 thread contention: 27812.477 ns/iter
6 thread contention: 35028.140 ns/iter
=> std::shared_mutex
No contention: 91.764 ns/iter
2 thread contention: 1051.826 ns/iter
3 thread contention: 5574.720 ns/iter
4 thread contention: 15721.416 ns/iter
5 thread contention: 27721.487 ns/iter
6 thread contention: 35420.404 ns/iter
Linux (x64):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 13.487 ns/iter
2 thread contention: 210.317 ns/iter
3 thread contention: 430.855 ns/iter
4 thread contention: 510.533 ns/iter
5 thread contention: 1003.609 ns/iter
6 thread contention: 1787.683 ns/iter
=> std::mutex
No contention: 12.403 ns/iter
2 thread contention: 157.122 ns/iter
3 thread contention: 186.791 ns/iter
4 thread contention: 265.073 ns/iter
5 thread contention: 283.778 ns/iter
6 thread contention: 270.687 ns/iter
=> std::shared_timed_mutex
No contention: 21.509 ns/iter
2 thread contention: 150.179 ns/iter
3 thread contention: 256.574 ns/iter
4 thread contention: 415.351 ns/iter
5 thread contention: 611.532 ns/iter
6 thread contention: 944.695 ns/iter
=> std::shared_mutex
No contention: 20.805 ns/iter
2 thread contention: 157.034 ns/iter
3 thread contention: 244.025 ns/iter
4 thread contention: 406.269 ns/iter
5 thread contention: 387.985 ns/iter
6 thread contention: 468.550 ns/iter
Linux (arm64):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 20.891 ns/iter
2 thread contention: 211.037 ns/iter
3 thread contention: 409.962 ns/iter
4 thread contention: 657.441 ns/iter
5 thread contention: 828.405 ns/iter
6 thread contention: 1131.827 ns/iter
=> std::mutex
No contention: 50.884 ns/iter
2 thread contention: 103.620 ns/iter
3 thread contention: 332.429 ns/iter
4 thread contention: 620.802 ns/iter
5 thread contention: 783.943 ns/iter
6 thread contention: 834.002 ns/iter
=> std::shared_timed_mutex
No contention: 64.948 ns/iter
2 thread contention: 173.191 ns/iter
3 thread contention: 490.352 ns/iter
4 thread contention: 660.668 ns/iter
5 thread contention: 1014.546 ns/iter
6 thread contention: 1451.553 ns/iter
=> std::shared_mutex
No contention: 64.521 ns/iter
2 thread contention: 195.222 ns/iter
3 thread contention: 490.819 ns/iter
4 thread contention: 654.786 ns/iter
5 thread contention: 955.759 ns/iter
6 thread contention: 1282.544 ns/iter
2018-07-14 00:39:01 +02:00
Arvid Gerstmann
9ac47eef0a
Merged in Leandros99/tracy/dev (pull request #9)
...
Couple of minor compatibility fixes
2018-07-13 22:05:13 +00:00
Bartosz Taudul
e285c837a4
Support TRACY_NO_EXIT env variable in addition to define.
2018-07-13 23:55:40 +02:00
Arvid Gerstmann
32fc011f80
Silence unused parameter warning
2018-07-13 23:39:25 +02:00
Bartosz Taudul
c3ba0ef4eb
Fix lua zone state init.
2018-07-13 20:21:50 +02:00
Bartosz Taudul
26f2cb336e
Return value from non-void function.
2018-07-13 20:12:39 +02:00
Bartosz Taudul
a3c898f8b8
Rename FrameMark() to SendFrameMark().
...
This avoids conflict with FrameMark define.
2018-07-13 20:09:19 +02:00
Arvid Gerstmann
6b87aecdce
Wrap concurrentqueue in tracy namespace
2018-07-13 20:01:27 +02:00
Bartosz Taudul
b11695111d
Implement on-demand Lua zone capture.
2018-07-12 12:53:35 +02:00
Bartosz Taudul
fbc5556ddd
Send memory events in on-demand mode.
2018-07-12 01:36:01 +02:00
Bartosz Taudul
26d5c4b302
Fix copy pasta.
2018-07-11 14:43:38 +02:00
Bartosz Taudul
96f39281a1
Implement on-demand locks.
2018-07-11 14:17:20 +02:00
Bartosz Taudul
d87508901f
Send deferred data.
2018-07-11 12:28:40 +02:00
Bartosz Taudul
ad0a75da7d
Defer lock announcements.
2018-07-11 12:24:58 +02:00
Bartosz Taudul
475d151b2d
Implement deferring items.
2018-07-11 12:21:39 +02:00
Bartosz Taudul
a99d74966c
Active status of scoped zone can't change.
2018-07-11 12:16:55 +02:00
Bartosz Taudul
52207f20b7
Add deferred events queue.
2018-07-11 12:14:28 +02:00
Bartosz Taudul
c2659473fd
Free memory associated with cleared queue items.
2018-07-11 01:34:48 +02:00
Bartosz Taudul
b1a71174db
Messages are also safe.
2018-07-10 23:09:59 +02:00
Bartosz Taudul
e80c677fa0
Plots can be safely sent in on-demand mode.
2018-07-10 23:06:27 +02:00
Bartosz Taudul
6a9caabc63
Send on-demand initial payload message.
2018-07-10 22:37:39 +02:00
Bartosz Taudul
43d5ab4382
Count frames in on-demand mode.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
03794a2957
Send frame marks in on-demand mode.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
f8b2ffdc7e
Clear queues before new on-demand connection is made.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
a767c5ea08
Trace zones in on-demand mode.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
c973735b49
Track connection status.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
010b19946f
Send on-demand status in welcome message.
2018-07-10 21:44:23 +02:00
Bartosz Taudul
c056f3be41
Send keep alive messages to determine if client disconnected.
2018-07-10 21:39:17 +02:00
Bartosz Taudul
e5b133073c
Disable all tracing if TRACY_ON_DEMAND is defined.
2018-07-10 20:49:51 +02:00
Tobias Widlund
626a995c63
Add size_t casts in asserts to get rid of sign-compare warnings on GCC
2018-07-01 20:02:53 +02:00
Tobias Widlund
273355b665
Change system include from using "" to <>
2018-06-30 16:00:51 +02:00
Tobias Widlund
b6cce4ddb6
Improve fixes for warnings as per request
2018-06-30 15:36:06 +02:00
Tobias Widlund
1c467a5847
Fix warning re shadowing, implicit conversion and added include <cstdio>
2018-06-30 11:47:27 +02:00
Bartosz Taudul
b29d60056a
Custom per-zone name transfer.
2018-06-29 16:01:31 +02:00
Bartosz Taudul
84c34ad826
Handle unicode builds.
2018-06-25 10:55:07 +02:00
Bartosz Taudul
64a38c591b
Don't perform multiple NeedDataSize checks.
2018-06-23 02:19:23 +02:00
Bartosz Taudul
4d197ec7a2
Unsafe version of AppendData.
2018-06-23 02:16:58 +02:00
Bartosz Taudul
a2c6848433
Send callstack payload without iteration, if possible.
2018-06-23 02:13:52 +02:00
Bartosz Taudul
a7ace6ef9e
Directly use RtlWalkFrameChain.
...
RtlCaptureStackBackTrace is just a wrapper for RtlWalkFrameChain.
2018-06-23 02:07:47 +02:00
Bartosz Taudul
19e83b434e
Increase max length of symbol on windows.
2018-06-23 00:27:14 +02:00
Bartosz Taudul
f0ce7de193
Move callstack collection in mem events out of critical section.
2018-06-22 23:00:03 +02:00
Bartosz Taudul
55ddb64352
GPU context counter is now 8 bit.
2018-06-22 15:10:23 +02:00
Bartosz Taudul
b6088b908f
Callstack capture for ZoneBegin.
2018-06-22 00:56:30 +02:00
Bartosz Taudul
bf7402e8b0
Android callstack collection using _Unwind_Backtrace().
2018-06-21 17:07:21 +02:00
Bartosz Taudul
0c13fb818b
Initialize rpmalloc in Mem{Alloc,Free}Callstack().
...
rpmalloc may still be uninitialized here (e.g. if memory allocation/free
is performed before any other tracy operation that would initialize
thread_local data). Since memory allocations use the serialized queue
(which is not held in the thread_local section) and obtaining a callstack
involves memory allocation, we need to initialize rpmalloc manually.
This won't be a problem when support for zone callbacks becomes online,
because zones are stored in per-thread queues, which initialize
thread_local data before rpmalloc is needed in the Callstack() call.
2018-06-21 17:02:40 +02:00
Bartosz Taudul
937141b7e3
Include symbol address in location field on linux.
2018-06-21 13:14:13 +02:00
Bartosz Taudul
b3ca36f3f4
Include symbol offset in symbol name on linux.
2018-06-21 13:10:48 +02:00
Bartosz Taudul
909166daf7
Hide SendCallstackMemory().
2018-06-20 23:30:19 +02:00
Bartosz Taudul
8c46ad81d5
Extract common code.
2018-06-20 23:29:44 +02:00
Bartosz Taudul
32278364cd
Demangle symbol names.
2018-06-20 23:01:00 +02:00
Bartosz Taudul
c8f51d7f11
More involved callstack frame description on linux.
2018-06-20 22:54:42 +02:00
Bartosz Taudul
36d81412a0
Fix copy pasta.
2018-06-20 22:27:46 +02:00
Bartosz Taudul
601c80466c
Fix use-after-free.
2018-06-20 22:18:12 +02:00
Bartosz Taudul
5541cd6c97
Linux callstack retrieval.
2018-06-20 21:54:11 +02:00
Bartosz Taudul
b4b08a0b29
Windows header poisoning should be avoided only in headers.
...
This fixes cygwin.
2018-06-20 21:01:25 +02:00
Bartosz Taudul
45cec65eef
Don't assign const char ptr to char ptr.
2018-06-20 20:35:57 +02:00
Bartosz Taudul
e495747b88
Fix off-by-one.
2018-06-20 17:02:05 +02:00
Bartosz Taudul
88b1955a5a
Filename in callstack frame is not a persistent pointer.
2018-06-20 01:26:05 +02:00
Bartosz Taudul
5177a7b960
Callstack frame transfer.
2018-06-20 01:06:31 +02:00
Bartosz Taudul
359feae7ef
Symbol retrieval may fail.
2018-06-20 01:05:44 +02:00
Bartosz Taudul
4be2543b2f
Cygwin support for callstack tracing.
2018-06-19 19:49:21 +02:00
Bartosz Taudul
9b1fb01e16
Disable Callstack() call if there's no callstack support.
2018-06-19 19:38:30 +02:00
Bartosz Taudul
0a8cd73db7
Issue predictive callback payload transfer.
2018-06-19 19:31:16 +02:00
Bartosz Taudul
51043ebc47
Callstack payload transfer.
2018-06-19 19:31:16 +02:00
Bartosz Taudul
55e6a4a484
No return status is needed here.
2018-06-19 19:00:57 +02:00
Bartosz Taudul
d0d3545988
Optional sending of callstack ptr in memory events.
2018-06-19 18:51:21 +02:00
Bartosz Taudul
d2a98c3090
Configurable callstack depth.
2018-06-19 18:49:13 +02:00
Bartosz Taudul
ca499eefaf
Return typeless pointer.
2018-06-19 17:27:03 +02:00
Bartosz Taudul
827900969f
Make Callstack() static inline.
2018-06-19 17:23:50 +02:00
Bartosz Taudul
ca2cac9b99
Use proper type for pointer size.
2018-06-19 14:34:37 +02:00
Bartosz Taudul
4a01eb7fc4
Windows callstack inspection plumbing.
2018-06-19 01:17:19 +02:00
Bartosz Taudul
7a23f677dd
Vulkan and OpenGL must share idx pool.
2018-06-18 01:10:43 +02:00
Bartosz Taudul
9c11e0fc5b
Vulkan tracing.
2018-06-17 18:14:37 +02:00
Bartosz Taudul
3432c594a9
ImplicitProducer is private.
2018-05-08 16:27:52 +02:00
Bartosz Taudul
e2534e2bf6
Forward declare explicit and implicit producers.
2018-05-08 12:33:19 +02:00
Bartosz Taudul
5b6d9769af
Properly separate HW timer from MSVC rdtscp optimization.
2018-04-27 19:40:47 +02:00
Bartosz Taudul
237aee30a8
Test if HW timer can be used on arm.
2018-04-27 16:58:45 +02:00
Bartosz Taudul
6a2311a7b7
Arm64 also defines __ARM_ARCH.
2018-04-26 17:39:04 +02:00
Bartosz Taudul
a3f5003f88
Read time from timer register on armv6, armv7.
...
Same improvement as on aarch64.
2018-04-26 17:18:10 +02:00
Bartosz Taudul
69a50b04c1
Really don't care about cpu id.
2018-04-26 16:12:52 +02:00
Bartosz Taudul
1899066e36
Read time from timer register on arm64.
...
On ODROID C2 this change improves timer resolution from 250 ns to 41 ns.
2018-04-26 16:03:31 +02:00
Bartosz Taudul
3a20104882
No need for separate tracy_rdtscp() function.
2018-04-26 15:30:53 +02:00
Bartosz Taudul
8cc9464082
Use GetTime() in CalibrateTimer().
2018-04-26 15:29:09 +02:00
Bartosz Taudul
48665cc09b
s/TRACY_RDTSCP_SUPPORTED/TRACY_HW_TIMER/
2018-04-26 15:25:54 +02:00
Bartosz Taudul
4eb205ad18
Optimize FastVector for fast push_next() operation.
2018-04-14 17:12:41 +02:00
Bartosz Taudul
15219b1481
Support 4-byte size_t.
2018-04-14 16:08:39 +02:00
Bartosz Taudul
459890ef0e
Don't hold lock on serial queue during dequeue.
2018-04-14 15:46:11 +02:00
Bartosz Taudul
e1dc62cabe
Add fast vector swap.
2018-04-14 15:46:01 +02:00
Bartosz Taudul
7c4075c9ce
Fix MemRead() call.
2018-04-03 17:57:12 +02:00
Bartosz Taudul
3ea5600900
Fix UB, lose type safety.
2018-04-03 17:51:53 +02:00
Bartosz Taudul
9c403d9cc2
GetTime() calls also must be serialized.
2018-04-01 21:07:33 +02:00
Bartosz Taudul
794f199bdc
Serial queue dequeuing.
2018-04-01 20:04:35 +02:00
Bartosz Taudul
860e0e1809
Store memory operations in the serial queue.
2018-04-01 19:53:24 +02:00
Bartosz Taudul
faeecdd773
Add serial queue to profiler.
2018-04-01 19:53:05 +02:00
Bartosz Taudul
0a3e9f85eb
"Fast" vector implementation.
2018-04-01 19:52:29 +02:00
Bartosz Taudul
991fc6bd95
Memory allocations tracker.
2018-03-31 21:56:05 +02:00
Bartosz Taudul
7a35e8facc
Fix typo.
2018-03-31 14:19:45 +02:00
Bartosz Taudul
a677048d2b
Fix try_lock().
2018-03-31 14:15:04 +02:00
Bartosz Taudul
3b03e849f0
Harden client code against unaligned memory access.
...
There shouldn't be any changes in generated code on modern
architectures, as the memcpy will be reduced to a store/load operation
identical to the one generated with plain struct member access.
GetTime( cpu ) needs special handling, as the MSVC intrinsic for rdtscp
can't store the cpu identifier in a register. Using an intermediate variable
would cause a store to the stack, a read from the stack, and a store to the
destination address. Since rdtscp is only available on x86, which handles
unaligned stores without any problems, we can have one place with direct
struct member access.
2018-03-31 14:15:04 +02:00
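The memcpy-based access pattern mentioned in the commit above can be sketched as below. `StoreUnaligned`, `LoadUnaligned` and `RoundTripAtOddOffset` are hypothetical helper names chosen for this example; they are not claimed to match the helpers in the Tracy sources.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: write and read a trivially copyable value at an
// address that may not be suitably aligned for T.
template <typename T>
inline void StoreUnaligned(void* dst, T value)
{
    // memcpy has no alignment requirement. On architectures that permit
    // unaligned access, compilers typically lower this to a single store.
    std::memcpy(dst, &value, sizeof(T));
}

template <typename T>
inline T LoadUnaligned(const void* src)
{
    T value;
    std::memcpy(&value, src, sizeof(T));
    return value;
}

// Writes a 64-bit timestamp at byte offset 1 of a buffer (misaligned for
// int64_t on common platforms) and reads it back.
inline int64_t RoundTripAtOddOffset(int64_t time)
{
    unsigned char buffer[1 + sizeof(int64_t)] = {};
    StoreUnaligned<int64_t>(buffer + 1, time);
    return LoadUnaligned<int64_t>(buffer + 1);
}
```

Dereferencing a misaligned `int64_t*` directly would be undefined behavior, whereas the memcpy form is well defined on every target.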
Bartosz Taudul
dca7338319
Update rpmalloc to 1.3.0.
2018-03-04 15:51:10 +01:00
Bartosz Taudul
0c1721144e
Backport concurrent queue's fixes.
...
420509b6678263f0fa6c0ffba87a15319238a1f2
2018-03-04 15:32:42 +01:00
Bartosz Taudul
7300c2e46e
Fix TRACY_NO_EXIT behavior.
...
The terminate event could be the first event that was sent. In such a case
the server immediately closed the connection, as there was no outstanding
data to receive. Fix by sending all data in the queue before sending
the terminate event.
2018-01-11 13:45:13 +01:00
Bartosz Taudul
c3a32f9c35
Send lock type in LockWait/LockSharedWait events.
...
This will be needed for proper construction of LockMap on the server, in
case the LockAnnounce message hasn't arrived yet.
2017-12-17 18:30:34 +01:00
Bartosz Taudul
bcf2bf1c5c
Shared lock events (still using old functionality).
2017-12-10 22:04:49 +01:00
Bartosz Taudul
a9e14c8990
Add standard lock events to shared locking.
2017-12-10 21:56:19 +01:00
Bartosz Taudul
782231b048
Shared lockable skeleton.
2017-12-10 21:49:45 +01:00
Bartosz Taudul
3567d7edd8
Reintroduce lock announce events.
2017-12-10 21:40:48 +01:00
Bartosz Taudul
f67465e784
Reduce timer calibration delay to 200 ms.
2017-11-25 13:34:26 +01:00
Bartosz Taudul
48da593ab2
Increase calibration time to half a second.
2017-11-24 01:43:35 +01:00
Bartosz Taudul
c431747f06
Favor transfer of zones without predicted payload.
2017-11-22 02:28:12 +01:00
Bartosz Taudul
630db7112a
Leaner iteration in Profiler::Dequeue().
2017-11-22 02:07:23 +01:00
Bartosz Taudul
a309e71fe1
Move force inline defines to a separate header.
2017-11-19 16:32:38 +01:00
Bartosz Taudul
5da8a7aa9b
Optimize deque.
2017-11-15 20:20:02 +01:00
Bartosz Taudul
2f669aea41
Workaround gcc issues.
2017-11-15 10:56:27 +01:00
Bartosz Taudul
c43eb29ce0
Don't send source location pointer in query reply.
...
Since reply order is the same as the query order, the server already
knows what source location it receives. This observation allows placing
zone name into the source location struct.
2017-11-14 23:06:45 +01:00
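The observation in the commit above, that in-order replies need not repeat the queried pointer, can be sketched from the receiving side with a FIFO of outstanding queries. `SourceLocationResolver` and its methods are hypothetical names for this example and do not describe the actual server code.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical receiver-side bookkeeping, for illustration only.
class SourceLocationResolver
{
public:
    // Records that a query for `ptr` has been sent.
    void Query(uint64_t ptr) { m_pending.push_back(ptr); }

    // Handles a reply that carries no pointer. Because replies arrive in
    // query order, it belongs to the oldest outstanding query.
    // Returns false if no query is outstanding.
    bool OnReply(const std::string& name)
    {
        if (m_pending.empty()) return false;
        m_names[m_pending.front()] = name;
        m_pending.pop_front();
        return true;
    }

    // Returns the resolved name for `ptr`, or nullptr if not yet known.
    const std::string* Find(uint64_t ptr) const
    {
        const auto it = m_names.find(ptr);
        return it == m_names.end() ? nullptr : &it->second;
    }

private:
    std::deque<uint64_t> m_pending;
    std::unordered_map<uint64_t, std::string> m_names;
};
```

This scheme depends entirely on the transport preserving order; a dropped or reordered reply would shift every later match.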
Bartosz Taudul
5c872b2137
Simplify GPU context handling.
2017-11-14 00:48:26 +01:00
Bartosz Taudul
3c00ce0958
GPU context registration.
2017-11-11 19:44:09 +01:00
Bartosz Taudul
81735aea2f
Support for setting zone names in lua.
2017-11-11 17:56:41 +01:00
Bartosz Taudul
59ec40c045
Preemptive transfer of source location payload.
2017-11-11 15:59:30 +01:00
Bartosz Taudul
7f3b8f4647
Preemptive message text delivery.
2017-11-11 15:41:21 +01:00
Bartosz Taudul
76e11174dc
Preemptive sending of custom strings.
2017-11-11 15:22:55 +01:00
Bartosz Taudul
c2797a4cc7
Data packets can't cross data buffer boundary.
2017-11-11 15:08:03 +01:00
Bartosz Taudul
49bce256bc
Fix type mismatch.
2017-11-11 14:35:46 +01:00
Bartosz Taudul
0d15d45c3a
Don't send source location through the queue.
2017-11-11 14:24:22 +01:00
Bartosz Taudul
065964b216
Send data before sleeping during shutdown.
2017-11-11 14:23:55 +01:00