Bartosz Taudul
facb05f8cb
Don't mark FastVector element as used until it's ready.
...
This should prevent a race condition that would leave an invalid last
element in the queue when a frozen thread has already obtained the queue
item but has not written to it (or has not written it fully).
2018-08-20 22:35:50 +02:00
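The ordering described in the FastVector commit above can be sketched as a two-step append: hand out the next slot first, and count it as used only once the write has finished. This is a minimal single-threaded illustration, not the project's code; the class name TwoPhaseVector and its member names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a FastVector-like container (fixed capacity).
template <typename T>
class TwoPhaseVector {
public:
    explicit TwoPhaseVector(std::size_t capacity) : m_data(capacity), m_size(0) {}

    // Step 1: return the next free slot WITHOUT counting it as used.
    // A reader that looks at size() cannot see this slot yet.
    T* prepare_next() {
        assert(m_size < m_data.size());
        return &m_data[m_size];
    }

    // Step 2: publish the slot after the caller has finished writing it.
    void commit_next() { ++m_size; }

    std::size_t size() const { return m_size; }
    const T& operator[](std::size_t i) const { return m_data[i]; }

private:
    std::vector<T> m_data;
    std::size_t m_size;
};
```

If the writing thread is stopped between the two steps, size() still excludes the half-written slot, so a consumer draining the container never reads it.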
Bartosz Taudul
d1adf9e8d6
Allow skipping functions on top of call stack.
...
Note that this is performance-intensive on the client and shouldn't be
used except in special situations, such as processing crashes.
2018-08-20 22:20:44 +02:00
Bartosz Taudul
b371003336
In case of manual shutdown, don't wait for lock.
...
All threads are frozen at this point, so nothing will release it.
2018-08-20 21:49:23 +02:00
Bartosz Taudul
ca939ccd19
Allow external profiler shutdown requests.
2018-08-20 01:02:27 +02:00
Bartosz Taudul
9d051cf5ee
Add support for discontinuous frames.
2018-08-05 02:15:54 +02:00
Bartosz Taudul
adde6cf4fd
Allow sending named frames.
2018-08-04 15:04:18 +02:00
Bartosz Taudul
922882d3b0
Add name field to frame mark message.
2018-08-04 15:03:47 +02:00
Till Rathmann
37d5736bf5
Fixed compiler warnings.
2018-08-01 14:07:30 +02:00
Till Rathmann
dd042619e9
Support for multi-DLL projects.
2018-07-31 12:06:04 +02:00
Bartosz Taudul
31c2ddb8ac
Rename client's SourceLocation to SourceLocationData.
2018-07-28 00:34:04 +02:00
Bartosz Taudul
3737e122cf
Of course, this can't work without stupid fuckery.
2018-07-26 19:59:55 +02:00
Bartosz Taudul
561d2dc360
Use the fastest mutex available.
...
The selection is based on the following test results:
MSVC (lock test, 6 threads, ns/iter):
Contention  NonRecursiveBenaphore  std::mutex  std::shared_timed_mutex  std::shared_mutex
none                       11.641      19.190                   45.487             10.861
2 threads                 141.559      39.305                   96.351             17.495
3 threads                 242.733      58.999                  142.871             31.126
4 threads                 409.807      59.532                  184.999             40.468
5 threads                 561.544     103.539                  336.608             15.677
6 threads                 785.845     110.314                  542.551             64.505
Cygwin (clang) (lock test, 6 threads, ns/iter):
Contention  NonRecursiveBenaphore  std::mutex  std::shared_timed_mutex  std::shared_mutex
none                       11.536      62.583                   91.687             91.764
2 threads                 121.082    3990.464                 1115.037           1051.826
3 threads                 396.430    7161.189                 4183.792           5574.720
4 threads                 672.555    9870.820                15283.491          15721.416
5 threads                1327.761   12355.178                27812.477          27721.487
6 threads               14151.955   14694.903                35028.140          35420.404
Linux (x64) (lock test, 6 threads, ns/iter):
Contention  NonRecursiveBenaphore  std::mutex  std::shared_timed_mutex  std::shared_mutex
none                       13.487      12.403                   21.509             20.805
2 threads                 210.317     157.122                  150.179            157.034
3 threads                 430.855     186.791                  256.574            244.025
4 threads                 510.533     265.073                  415.351            406.269
5 threads                1003.609     283.778                  611.532            387.985
6 threads                1787.683     270.687                  944.695            468.550
Linux (arm64) (lock test, 6 threads, ns/iter):
Contention  NonRecursiveBenaphore  std::mutex  std::shared_timed_mutex  std::shared_mutex
none                       20.891      50.884                   64.948             64.521
2 threads                 211.037     103.620                  173.191            195.222
3 threads                 409.962     332.429                  490.352            490.819
4 threads                 657.441     620.802                  660.668            654.786
5 threads                 828.405     783.943                 1014.546            955.759
6 threads                1131.827     834.002                 1451.553           1282.544
2018-07-14 00:39:01 +02:00
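One way to act on the lock test results in the commit above is to pick the mutex type at compile time. The sketch below is an assumption about how such a selection could look, not the committed code: the alias name SketchMutex and the helper guarded_increment are hypothetical, and the Cygwin case (where NonRecursiveBenaphore did slightly better) is left out because that type is not part of the standard library.

```cpp
#include <cassert>
#include <mutex>

#if defined(_MSC_VER)
// In the MSVC numbers, std::shared_mutex had the lowest cost at every contention level.
#  include <shared_mutex>
using SketchMutex = std::shared_mutex;
#else
// In the Linux numbers (x64 and arm64), std::mutex held up best under contention.
using SketchMutex = std::mutex;
#endif

// Hypothetical helper: both candidate types provide lock()/unlock(),
// so the calling code does not change with the selection.
inline int guarded_increment(SketchMutex& m, int& counter) {
    std::lock_guard<SketchMutex> lock(m);
    return ++counter;
}
```

Only exclusive locking is used here, which is why std::shared_mutex can stand in for a plain mutex without touching the callers.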
Bartosz Taudul
e285c837a4
Support TRACY_NO_EXIT env variable in addition to define.
2018-07-13 23:55:40 +02:00
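A setting that can be enabled either by a compile-time define or by an environment variable, as in the TRACY_NO_EXIT commit above, might be checked as follows. This is a hedged sketch: the function names are hypothetical, and treating a value starting with '1' as "enabled" is an assumption, since the commit message does not say which values are accepted.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: interprets an environment variable value.
// Assumption: only a value beginning with '1' enables the flag.
inline bool flag_is_set(const char* value) {
    return value != nullptr && value[0] == '1';
}

// Hypothetical helper: the define wins when present at build time;
// otherwise the environment is consulted at run time.
inline bool no_exit_requested() {
#ifdef TRACY_NO_EXIT
    return true;
#else
    return flag_is_set(std::getenv("TRACY_NO_EXIT"));
#endif
}
```

Keeping the value check separate from the std::getenv call makes the parsing rule easy to exercise without changing the process environment.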
Bartosz Taudul
c3ba0ef4eb
Fix lua zone state init.
2018-07-13 20:21:50 +02:00
Bartosz Taudul
a3c898f8b8
Rename FrameMark() to SendFrameMark().
...
This avoids conflict with FrameMark define.
2018-07-13 20:09:19 +02:00
Arvid Gerstmann
6b87aecdce
Wrap concurrentqueue in tracy namespace
2018-07-13 20:01:27 +02:00
Bartosz Taudul
fbc5556ddd
Send memory events in on-demand mode.
2018-07-12 01:36:01 +02:00
Bartosz Taudul
475d151b2d
Implement deferring items.
2018-07-11 12:21:39 +02:00
Bartosz Taudul
52207f20b7
Add deferred events queue.
2018-07-11 12:14:28 +02:00
Bartosz Taudul
b1a71174db
Messages are also safe.
2018-07-10 23:09:59 +02:00
Bartosz Taudul
e80c677fa0
Plots can be safely sent in on-demand mode.
2018-07-10 23:06:27 +02:00
Bartosz Taudul
43d5ab4382
Count frames in on-demand mode.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
03794a2957
Send frame marks in on-demand mode.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
f8b2ffdc7e
Clear queues before new on-demand connection is made.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
c973735b49
Track connection status.
2018-07-10 22:27:19 +02:00
Bartosz Taudul
e5b133073c
Disable all tracing if TRACY_ON_DEMAND is defined.
2018-07-10 20:49:51 +02:00
Bartosz Taudul
4d197ec7a2
Unsafe version of AppendData.
2018-06-23 02:16:58 +02:00
Bartosz Taudul
f0ce7de193
Move callstack collection in mem events out of critical section.
2018-06-22 23:00:03 +02:00
Bartosz Taudul
b6088b908f
Callstack capture for ZoneBegin.
2018-06-22 00:56:30 +02:00
Bartosz Taudul
0c13fb818b
Initialize rpmalloc in Mem{Alloc,Free}Callstack().
...
rpmalloc may still be uninitialized here (e.g. if a memory allocation/free
is performed before any other tracy operation that would initialize
thread_local data). Since memory allocations use a serialized queue
(which is not held in the thread_local section) and obtaining a callstack
involves memory allocation, we need to initialize rpmalloc manually.
This won't be a problem when support for zone callbacks comes online,
because zones are stored in per-thread queues, which initialize
thread_local data before rpmalloc is needed in the Callstack() call.
2018-06-21 17:02:40 +02:00
Bartosz Taudul
909166daf7
Hide SendCallstackMemory().
2018-06-20 23:30:19 +02:00
Bartosz Taudul
8c46ad81d5
Extract common code.
2018-06-20 23:29:44 +02:00
Bartosz Taudul
5177a7b960
Callstack frame transfer.
2018-06-20 01:06:31 +02:00
Bartosz Taudul
9b1fb01e16
Disable Callstack() call if there's no callstack support.
2018-06-19 19:38:30 +02:00
Bartosz Taudul
51043ebc47
Callstack payload transfer.
2018-06-19 19:31:16 +02:00
Bartosz Taudul
55e6a4a484
No return status is needed here.
2018-06-19 19:00:57 +02:00
Bartosz Taudul
d0d3545988
Optional sending of callstack ptr in memory events.
2018-06-19 18:51:21 +02:00
Bartosz Taudul
9c11e0fc5b
Vulkan tracing.
2018-06-17 18:14:37 +02:00
Bartosz Taudul
5b6d9769af
Properly separate HW timer from MSVC rdtscp optimization.
2018-04-27 19:40:47 +02:00
Bartosz Taudul
237aee30a8
Test if HW timer can be used on arm.
2018-04-27 16:58:45 +02:00
Bartosz Taudul
6a2311a7b7
Arm64 also defines __ARM_ARCH.
2018-04-26 17:39:04 +02:00
Bartosz Taudul
a3f5003f88
Read time from timer register on armv6, armv7.
...
Same improvement as on aarch64.
2018-04-26 17:18:10 +02:00
Bartosz Taudul
69a50b04c1
Really don't care about cpu id.
2018-04-26 16:12:52 +02:00
Bartosz Taudul
1899066e36
Read time from timer register on arm64.
...
On ODROID C2 this change improves timer resolution from 250 ns to 41 ns.
2018-04-26 16:03:31 +02:00
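Reading the time from a timer register on arm64, as the commit above describes, can be sketched with a single instruction that reads the virtual counter. The function name read_ticks is hypothetical, the std::chrono fallback exists only so the sketch builds on other architectures, and whether user space may read the register depends on the operating system.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: returns a monotonically increasing tick count.
inline std::uint64_t read_ticks() {
#if defined(__aarch64__)
    // Read the generic timer's virtual count register directly.
    std::uint64_t t;
    asm volatile("mrs %0, cntvct_el0" : "=r"(t));
    return t;
#else
    // Portable fallback (an assumption of this sketch, not part of the commit).
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
#endif
}
```

The two branches return values in different units (counter ticks versus nanoseconds), so a caller that needs wall-clock durations would still have to scale the register value by the counter frequency.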
Bartosz Taudul
3a20104882
No need for separate tracy_rdtscp() function.
2018-04-26 15:30:53 +02:00
Bartosz Taudul
48665cc09b
s/TRACY_RDTSCP_SUPPORTED/TRACY_HW_TIMER/
2018-04-26 15:25:54 +02:00
Bartosz Taudul
15219b1481
Support 4-byte size_t.
2018-04-14 16:08:39 +02:00
Bartosz Taudul
459890ef0e
Don't hold lock on serial queue during dequeue.
2018-04-14 15:46:11 +02:00
Bartosz Taudul
9c403d9cc2
GetTime() calls also must be serialized.
2018-04-01 21:07:33 +02:00
Bartosz Taudul
794f199bdc
Serial queue dequeuing.
2018-04-01 20:04:35 +02:00