Bartosz Taudul
db3e802643
Build reverse CPU topology map.
2019-11-29 22:46:57 +01:00
Bartosz Taudul
712403e9fd
Transfer, display, save CPU topology data.
2019-11-29 22:41:41 +01:00
Bartosz Taudul
4551553eb4
Implement setting client parameters from server.
2019-11-25 23:59:48 +01:00
Bartosz Taudul
fb6a92380d
Drop support for pre-v0.5 traces.
2019-11-21 21:48:35 +01:00
Bartosz Taudul
49945c7198
Process message callstacks.
2019-11-15 01:22:26 +01:00
Bartosz Taudul
b946c1d39e
Only enable magic fitted vectors in no-statistics builds.
...
Source location zones pointer fixup is just too slow to be feasible.
Note: no-statistics builds of the graphical profiler don't perform fixup
of view-related pointers (e.g. zone info window zone pointer). This
won't cause crashes, because the pointers are still valid, but the
displayed data will be incorrect and potentially changing in time, as
the pointer can be reused for completely other zone.
Memory usage of ToyPathTracer data, in various scenarios:
Capture + statistics: 7121 MB
Load + statistics: 6057 MB
Capture - statistics: 4876 MB
Load - statistics: 4521 MB
2019-11-11 00:20:33 +01:00
Bartosz Taudul
e1e3bbbe3e
Fixup source location zones pointers.
2019-11-11 00:20:33 +01:00
Bartosz Taudul
4f962d2fcc
Add ZoneEvent re-use pool.
2019-11-11 00:20:33 +01:00
Bartosz Taudul
d4a1168491
Messages are inserted for current thread context.
2019-11-10 17:23:04 +01:00
Bartosz Taudul
b1c88cd1f2
Cache ThreadData pointer for current thread context.
2019-11-10 17:17:07 +01:00
Bartosz Taudul
672093cf0e
Adapt WriteTimeline() to magic vectors.
2019-11-10 16:34:38 +01:00
Bartosz Taudul
f8edd3a37b
Zone statistics reconstructions has to use magic vectors.
2019-11-10 00:00:40 +01:00
Bartosz Taudul
ec895372b7
Thread is not needed in ReadTimeline().
2019-11-08 23:56:11 +01:00
Bartosz Taudul
6ec734c264
Split ReadTimelineUpdateStatistics().
2019-11-08 23:53:43 +01:00
Bartosz Taudul
ea2c329510
Input data *must not* be changed.
...
Not even for a short moment.
2019-11-07 01:29:11 +01:00
Bartosz Taudul
dfad9695d2
Compress frame image data right as it arrives.
...
This removes the need to store temporary uncompressed image buffers,
which involves constant memory allocation and freeing. Instead, just one
permanent buffer is used, and only because the input data cannot change
during processing.
2019-11-06 23:29:59 +01:00
Bartosz Taudul
46d33f45bf
Frame image packer doesn't care about width and height.
2019-11-06 22:53:01 +01:00
Bartosz Taudul
f53637891a
Remove LZ4 include from TracyWorker.hpp.
2019-11-06 01:25:38 +01:00
Bartosz Taudul
661c4a417b
Process and store plot value formatting.
2019-11-05 18:02:08 +01:00
Bartosz Taudul
50b96c757e
Context switch usage reconstruction skeleton.
2019-11-05 01:28:44 +01:00
Bartosz Taudul
0df29d1e0b
Use short ptr for source location payload data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
04c92f8d19
Use short ptr for callstack payload storage.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
b0e52f20f8
Use short ptr for FrameImage storage.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
72efbe28ed
Use short ptr for message data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
52062f96d0
Use short ptr for GPU context map.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
a40bbacb17
Use short ptr for CPU zone data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
cb20bf01f9
Use short ptr for GPU zone data.
2019-11-02 16:54:11 +01:00
Bartosz Taudul
16bc862904
Save sizes of children vectors to prevent reallocation.
2019-11-02 12:38:32 +01:00
Bartosz Taudul
25b610a36f
Pack child into GPU start/end in GpuEvent (saves 4 bytes).
...
long 5152 MB -> 5061 MB
2019-10-30 23:50:37 +01:00
Bartosz Taudul
789b95f259
Force inline small functions.
2019-10-29 01:32:09 +01:00
Bartosz Taudul
8050622b0f
Read and decompress network data on a separate thread.
2019-10-28 23:22:50 +01:00
Bartosz Taudul
788ca2e5df
Spawn no-op network thread.
2019-10-28 22:45:10 +01:00
Bartosz Taudul
01985f50ef
Cache source location zones counter search.
2019-10-26 16:33:40 +02:00
Bartosz Taudul
1d0084aa28
Add cache for last accessed source location zones.
2019-10-25 21:29:55 +02:00
Bartosz Taudul
779063a18b
Cache last shrinked source location.
2019-10-25 21:07:28 +02:00
Bartosz Taudul
294793367f
Cache last CheckSourceLocation query.
...
Just knowing that the query was performed is enough here -- this
function adds a new source location entry, if there already isn't one.
2019-10-25 21:01:33 +02:00
Bartosz Taudul
0f2503d334
Send time deltas in GPU time events.
2019-10-25 19:52:01 +02:00
Bartosz Taudul
1ce25d3aef
Init cache in-place.
2019-10-25 19:19:35 +02:00
Bartosz Taudul
8fa5188176
Send delta times for context switches.
2019-10-25 19:13:11 +02:00
Bartosz Taudul
c8e5489e99
Group caches together.
2019-10-25 18:16:27 +02:00
Bartosz Taudul
d6a8a8532f
Prevent storing variable on stack.
2019-10-24 23:40:21 +02:00
Bartosz Taudul
1fe76be955
Don't reconstruct lock event time on insert.
2019-10-24 23:25:04 +02:00
Bartosz Taudul
45332fd837
Don't read memory when setting values.
2019-10-24 23:03:13 +02:00
Bartosz Taudul
5873561b54
Add cached thread retriever.
2019-10-24 22:33:48 +02:00
Bartosz Taudul
1cfb5adc44
Count transferred data size.
2019-10-24 00:47:16 +02:00
Bartosz Taudul
ba61a9ed84
Transfer time deltas, not absolute times.
...
This change significantly reduces network bandwidth requirements.
Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
Bartosz Taudul
3ae5c125f6
Implement counting CPU usage (ctx switch) at a given time.
2019-10-15 16:54:43 +02:00
Bartosz Taudul
eccb0b1e4a
Track max CPU present in context switch data.
2019-10-15 16:13:53 +02:00
Bartosz Taudul
1ae49c14a2
GPU zone count accessor.
2019-10-13 14:13:28 +02:00
Bartosz Taudul
5e1894dd79
Count GPU zones.
2019-10-13 14:13:04 +02:00
Bartosz Taudul
f0b957ec56
Store callstacks on 24 bits.
...
ZoneEvent is now 27 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9224 -> 9011 (97%)
chicken 2044 -> 2027 (99%)
drl-l-b 1443 -> 1383 (95%)
long 5327 -> 5253 (98%)
q3bsp-mt 5400 -> 5304 (98%)
selfprofile 1403 -> 1382 (98%)
2019-10-01 22:38:17 +02:00
Bartosz Taudul
717a212563
Save another 2 bytes per ZoneEvent.
...
ZoneEvent is not 28 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9527 -> 9224 (96%)
chicken 2107 -> 2044 (97%)
drl-l-b 1479 -> 1443 (97%)
long 5412 -> 5327 (98%)
q3bsp-mt 5592 -> 5400 (96%)
selfprofile 1443 -> 1403 (97%)
2019-10-01 01:05:37 +02:00
Bartosz Taudul
2470936050
Don't perform background tasks during trace upgrade.
2019-09-29 20:52:25 +02:00
Bartosz Taudul
d228bcb622
Pack StringIdx in 24 bits.
...
This reduces ZoneEvent size from 32 to 30 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9902 -> 9527 (96%)
chicken 2172 -> 2107 (97%)
ctx-big 311 -> 309 (99%)
drl-l-b 1570 -> 1479 (94%)
long 5496 -> 5412 (98%)
mem 6468 -> 6468 (100%)
q3bsp-mt 5784 -> 5592 (96%)
selfprofile 1486 -> 1443 (97%)
2019-09-29 20:32:42 +02:00
Bartosz Taudul
c91ae667d1
Add string count getter.
2019-09-29 19:22:15 +02:00
Bartosz Taudul
82cd667b30
Allow specifying network port in server.
2019-09-21 15:43:01 +02:00
Bartosz Taudul
e1e5d6bd47
Add const version of PackFrameImage().
...
Temporary buffer needs to be handled outside of the function.
2019-09-20 22:55:55 +02:00
Bartosz Taudul
ea6a0a58a7
Thread data accessor.
2019-09-08 14:07:16 +02:00
Bartosz Taudul
86cb477811
Pack ZoneThreadData.
...
This reduces struct size from 10 to 8 bytes. Assumes 48-bit pointers
(4-level paging)!
Memory savings (MB):
android 2766 -> 2757 (99%)
big 10.29 G -> 9902 (96%)
chicken 2244 -> 2172 (96%)
ctx-android 228 -> 224 (98%)
drl-l-b 1635 -> 1570 (96%)
gn-vulkan 244 -> 240 (98%)
long 5656 -> 5496 (97%)
q3bsp-mt 6043 -> 5784 (95%)
selfprofile 1554 -> 1486 (95%)
2019-08-31 00:55:51 +02:00
Bartosz Taudul
fd5014be6f
GetThreadString() is no longer used.
2019-08-28 20:08:16 +02:00
Bartosz Taudul
3c092b4bec
Add thread name getter combining local and external thread names.
2019-08-27 23:00:13 +02:00
Bartosz Taudul
1712431dfd
Compress external threads. Saves 4 bytes per ctx switch.
...
Dropped support for loading context switch data in previous versions of
traces.
2019-08-19 23:09:58 +02:00
Bartosz Taudul
21e7a4bb16
Extract thread compression into a separate class.
2019-08-19 22:56:58 +02:00
Bartosz Taudul
62dbe522c5
Add accessors.
2019-08-18 01:51:02 +02:00
Bartosz Taudul
103645c2fa
Calculate cpu thread data statistics.
2019-08-18 01:50:49 +02:00
Bartosz Taudul
20e8a5ecc8
Create tid to pid mapping.
2019-08-17 22:32:41 +02:00
Bartosz Taudul
678e942e9f
Transfer PID of profiled program.
2019-08-17 22:19:04 +02:00
Bartosz Taudul
414f903cc5
Collect thread wakeup data.
2019-08-17 17:05:29 +02:00
Bartosz Taudul
e975c4d7bf
Also retrieve external thread names.
2019-08-16 19:49:16 +02:00
Bartosz Taudul
fe7f56b022
Implement retrieval of external process names.
2019-08-16 19:22:23 +02:00
Bartosz Taudul
c212661714
Allow determining whether thread is local to profiled program.
2019-08-16 17:59:25 +02:00
Bartosz Taudul
cef7e4b8d0
Save/load per-cpu context switches.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
8bc4258e29
Display count of per-cpu context switch data.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
a92034d59d
CPU data accessor.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
69527d2f71
Collect per-cpu context switch data.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
41beff29a9
Remove redundant GetTimeBegin().
...
Traces now start at zero time.
2019-08-15 21:04:20 +02:00
Bartosz Taudul
bf3ad57456
Pack start time and srcloc together in ZoneEvent.
...
This reduces ZoneEvent struct size by 2 bytes. Memory savings on various
captures:
10.62 GB -> 10.29 GB
2342 MB -> 2276 MB
1706 MB -> 1635 MB
6277 MB -> 6085 MB
2019-08-15 20:17:36 +02:00
Bartosz Taudul
b322d20c19
Store received timestamps offset to 0.
2019-08-15 20:17:36 +02:00
Bartosz Taudul
659907c972
Store srcloc identifiers using 16 bit.
...
This reduces various structure sizes by 2 bytes. Memory usage reduction
on various traces:
big 11 GB -> 10.62 GB
chicken 2436 MB -> 2342 MB
drl-light-big 1761 MB -> 1706 MB
q3bsp-mt 6469 MB -> 6277 MB
2019-08-15 20:15:48 +02:00
Bartosz Taudul
416113fdcb
Drop support for ETC1 frame images.
2019-08-15 16:29:50 +02:00
Bartosz Taudul
a194c93740
Allow checking if context switch data is available.
2019-08-14 20:26:55 +02:00
Bartosz Taudul
9a364fe5fe
Cache context switch data queries.
2019-08-14 20:16:11 +02:00
Bartosz Taudul
9417ad994d
Save/load context switch data.
2019-08-13 13:10:37 +02:00
Bartosz Taudul
8c494eabbf
Display number of context switch regions.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
0b03fed61c
Add context switch accessor.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
419f74280d
Store context switches.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
8aa0be39d5
Drop support for CPU id queries.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
154c902e03
Handle legacy file versions.
2019-08-12 12:36:37 +02:00
Bartosz Taudul
a76622d17a
Cache last searched ThreadData.
2019-08-03 14:35:01 +02:00
Bartosz Taudul
12969ee497
Track thread context.
...
This change exploits the fact that events are processed in batches
originating from a single thread. A single message changing thread
context is enough to handle multiple messages, as opposed to inclusion
of thread identifier in each message.
2019-08-02 20:18:08 +02:00
Bartosz Taudul
a4e7a341c0
Proper handling of disconnect request.
2019-08-01 23:14:09 +02:00
Bartosz Taudul
8c9d46ef29
Display application info in info window.
2019-07-12 18:39:07 +02:00
Bartosz Taudul
d64ab7db5a
Store app info messages.
2019-07-12 18:34:46 +02:00
Bartosz Taudul
10bcc8c770
Switch to DXT1 textures in profiler utility.
2019-06-27 19:14:51 +02:00
Bartosz Taudul
7dc7ece2bd
Add staging area for frame images.
...
Compressing frame images on a separate thread may cause frame image
arrival before frames are sent. Fix this issue by creating a staging
area in which frame images will wait for frames to arrive.
This probably breaks playback functionality, as non-existent frames may
be queried, but this problem seems to be very hard to find, so let's
ignore it for now.
2019-06-27 13:24:35 +02:00
Bartosz Taudul
8f7be5a0fa
Allow only 2^32-1 frame images.
2019-06-22 14:11:45 +02:00
Bartosz Taudul
6a82f666a7
Cosmetics.
2019-06-22 14:05:18 +02:00
Bartosz Taudul
31a4a45b14
Ignore memory free faults if running on apple.
...
There's a case in MoltenVK initialization where overloading operator new
and operator delete works for std::string destruction, but not
construction.
2019-06-13 14:15:17 +02:00
Bartosz Taudul
38b76ea32d
Add frame images vector accessor.
2019-06-12 00:14:44 +02:00
Bartosz Taudul
2c780f1af4
Allow sending immediate termination query from server.
2019-06-09 16:10:49 +02:00
Bartosz Taudul
50cda7720f
Handle frame image instrumentation failures.
2019-06-09 13:44:53 +02:00
Bartosz Taudul
bef1988800
Compress frame images using LZ4.
2019-06-08 12:17:18 +02:00
Bartosz Taudul
646e7327b8
Show loading progress of frame images.
2019-06-06 23:40:37 +02:00
Bartosz Taudul
f8a4909c96
Display number of frame images in a trace.
2019-06-06 23:21:36 +02:00
Bartosz Taudul
6b2741ccdb
Save/load frame images.
2019-06-06 23:08:19 +02:00
Bartosz Taudul
af56f41e32
Add frame image accessor.
2019-06-06 22:14:51 +02:00
Bartosz Taudul
e5bb6011c5
Frame image transfer prototype.
2019-06-06 21:39:54 +02:00
Bartosz Taudul
5681096486
Track status of worker background tasks.
2019-06-02 15:00:38 +02:00
Bartosz Taudul
845f3a2ddf
Use std::shared_mutex for locking worker access.
2019-05-28 19:21:53 +02:00
Bartosz Taudul
b3812146cb
Fix atomics initialization.
2019-05-27 14:09:55 +02:00
Bartosz Taudul
efc54babe3
Transfer of colored messages.
2019-05-10 20:17:44 +02:00
Bartosz Taudul
20e6813461
Store send queue size in mbps block.
2019-04-01 19:55:37 +02:00
Bartosz Taudul
57dff0abc9
Add server query queue.
2019-04-01 19:26:50 +02:00
Bartosz Taudul
c07c6d11b7
Define server query packet.
2019-04-01 19:21:53 +02:00
Bartosz Taudul
a632d9e2a3
Add zone vector cache.
...
Zone children will be now collected in staging vectors. When the zone is
ended (and no children can be added anymore to it), a size-fitted vector
is allocated using slab allocation. The over-allocated vector is then
put into cache for use in future zones.
This is only active for vectors <= 8192 elements, or 64 KB (chosen
arbitrarily), to reduce time spent on copying memory.
Overall, this change should have the following effects:
- System memory allocation pressure reduction, due to re-usage of
vectors, which eliminates the need for constant growth.
- Reduction of memory usage, because children vectors are now fitted to
required size.
- Slight increase of zone processing time, due to memory copying?
2019-03-26 22:06:00 +01:00
Bartosz Taudul
11f4dcbf1e
Consistent variable naming.
2019-03-26 21:41:44 +01:00
Bartosz Taudul
71e20e7e7f
Store lock map as flat_hash_map with pointer values.
2019-03-16 02:50:51 +01:00
Bartosz Taudul
a0299cc63a
Optimize calculation of standard deviation.
2019-03-14 01:23:37 +01:00
Bartosz Taudul
f57cac9042
Initialize SourceLocationZones in-place.
2019-03-14 01:15:19 +01:00
Bartosz Taudul
ebf09bebae
Only one callstack may be in-flight at any time.
...
Save for the allocated callstack, but this will be solved in another
way. There's no need to keep pending callstacks in a map.
2019-03-05 19:44:08 +01:00
Bartosz Taudul
3e81e44d75
Separate processing for allocated callstacks.
...
Only native callstack is used for the moment.
2019-03-05 02:42:50 +01:00
Bartosz Taudul
dc74297439
Add missing const qualifiers.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
bef31ba073
Separate message for zone begin with alloc src loc and callstack.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
66b8a13e77
Store callstack alloc payloads.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
9fc022346b
Replace frame pointers with callstack frame ids.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
664847211c
Pack/unpack frame pointer to callstack frame id.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
1feedb17ac
Add callstack frame identifier and the required plumbing.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
42c94f7142
Handle discontinuous frame end mismatch failure.
2019-02-28 19:32:42 +01:00
Bartosz Taudul
e9baa80bf3
Process CPU usage reports.
2019-02-21 22:56:59 +01:00
Bartosz Taudul
4422fce55c
Don't decompress GpuZone threads while saving trace.
...
Saving the threads compressed in GPU zones and memory event has the
following result on trace dump sizes:
043/aa.tracy (0.4.3) {10055 KB} -> 044/aa.tracy (0.4.4) {9975 KB} 99.20% size change
043/android.tracy (0.4.3) {542739 KB} -> 044/android.tracy (0.4.4) {519248 KB} 95.67% size change
043/asset-new.tracy (0.4.3) {78403 KB} -> 044/asset-new.tracy (0.4.4) {75899 KB} 96.81% size change
043/asset-new-id.tracy (0.4.3) {84341 KB} -> 044/asset-new-id.tracy (0.4.4) {81771 KB} 96.95% size change
043/asset-old.tracy (0.4.3) {80688 KB} -> 044/asset-old.tracy (0.4.4) {78410 KB} 97.18% size change
043/big.tracy (0.4.3) {939577 KB} -> 044/big.tracy (0.4.4) {938427 KB} 99.88% size change
043/callstack.tracy (0.4.3) {14557 KB} -> 044/callstack.tracy (0.4.4) {14465 KB} 99.37% size change
043/callstack-linux.tracy (0.4.3) {6949 KB} -> 044/callstack-linux.tracy (0.4.4) {6942 KB} 99.90% size change
043/crash.tracy (0.4.3) {131 KB} -> 044/crash.tracy (0.4.4) {127 KB} 97.10% size change
043/crash2.tracy (0.4.3) {1422 KB} -> 044/crash2.tracy (0.4.4) {1412 KB} 99.29% size change
043/darkrl.tracy (0.4.3) {15767 KB} -> 044/darkrl.tracy (0.4.4) {15663 KB} 99.34% size change
043/darkrl2.tracy (0.4.3) {7947 KB} -> 044/darkrl2.tracy (0.4.4) {7886 KB} 99.23% size change
043/darkrl-old.tracy (0.4.3) {67448 KB} -> 044/darkrl-old.tracy (0.4.4) {67004 KB} 99.34% size change
043/deadlock.tracy (0.4.3) {5984 KB} -> 044/deadlock.tracy (0.4.4) {5986 KB} 100.03% size change
043/gn-opengl.tracy (0.4.3) {29005 KB} -> 044/gn-opengl.tracy (0.4.4) {28885 KB} 99.59% size change
043/gn-vulkan.tracy (0.4.3) {29352 KB} -> 044/gn-vulkan.tracy (0.4.4) {29257 KB} 99.68% size change
043/long.tracy (0.4.3) {1182800 KB} -> 044/long.tracy (0.4.4) {1176584 KB} 99.47% size change
043/mem.tracy (0.4.3) {1369067 KB} -> 044/mem.tracy (0.4.4) {1262406 KB} 92.21% size change
043/multi.tracy (0.4.3) {8004 KB} -> 044/multi.tracy (0.4.4) {7944 KB} 99.24% size change
043/new.tracy (0.4.3) {1108 KB} -> 044/new.tracy (0.4.4) {1099 KB} 99.18% size change
043/q3bsp-mt.tracy (0.4.3) {949855 KB} -> 044/q3bsp-mt.tracy (0.4.4) {937574 KB} 98.71% size change
043/q3bsp-st.tracy (0.4.3) {240347 KB} -> 044/q3bsp-st.tracy (0.4.4) {230092 KB} 95.73% size change
043/selfprofile.tracy (0.4.3) {197708 KB} -> 044/selfprofile.tracy (0.4.4) {197659 KB} 99.98% size change
043/tbrowser.tracy (0.4.3) {9503 KB} -> 044/tbrowser.tracy (0.4.4) {9503 KB} 100.00% size change
043/test.tracy (0.4.3) {40700 KB} -> 044/test.tracy (0.4.4) {40699 KB} 100.00% size change
043/virtualfile_hc.tracy (0.4.3) {72424 KB} -> 044/virtualfile_hc.tracy (0.4.4) {72304 KB} 99.83% size change
043/zfile_hc.tracy (0.4.3) {39419 KB} -> 044/zfile_hc.tracy (0.4.4) {39328 KB} 99.77% size change
2019-02-17 00:29:01 +01:00
Bartosz Taudul
92c1420c30
Improve handling of post-load background jobs.
...
Background tasks (source location zones sorting, reconstruction of
memory plot) are now started only after trace loading is finished.
Multithreaded sorting was previously impacting trace load times.
Only one thread is used to perform both jobs, one after another.
2019-02-14 01:17:37 +01:00
Bartosz Taudul
631f81e9dc
Use Vector to store children data instead of std::vector.
2019-02-13 02:32:25 +01:00
Bartosz Taudul
08642d034b
Preserve string length in string map.
2019-02-12 22:11:15 +01:00
Bartosz Taudul
76186f3221
Allow zone name retrieval from source location.
2019-02-10 16:45:19 +01:00
Bartosz Taudul
922993cbbb
Add placeholder worker disconnect command.
2019-01-24 18:51:55 +01:00
Bartosz Taudul
42af2d14cc
Calculate self min and max times of source location zones.
2019-01-23 14:24:22 +01:00
Bartosz Taudul
ddad475c19
Make it possible to store multiple frames at single frame address.
2019-01-20 19:11:48 +01:00
Bartosz Taudul
92f3a4bba0
Add ZoneText and ZoneName to the C API.
2019-01-16 02:10:21 +01:00
Bartosz Taudul
49e270d8a6
Detect zone end without begin failure.
2019-01-16 00:45:48 +01:00
Bartosz Taudul
708fdfea49
Track memory alloc+free matching failures.
2019-01-15 18:56:26 +01:00
Bartosz Taudul
76ab70a948
Simplify failure detection code.
2019-01-15 18:55:47 +01:00
Bartosz Taudul
9944a73444
Store failure reason strings in Worker.
2019-01-15 18:42:15 +01:00
Bartosz Taudul
c3246ca3b5
Gracefully store failure states.
2019-01-14 23:22:31 +01:00
Bartosz Taudul
c3b67e4482
Perform zone stack validation.
2019-01-14 23:08:34 +01:00
Bartosz Taudul
dcc6bee607
Process zone validation messages.
2019-01-14 22:56:10 +01:00
Bartosz Taudul
fbe8eb3585
Fix initialization of atomics.
2019-01-06 21:09:56 +01:00
Bartosz Taudul
980c54e349
Track trace loading time.
2019-01-06 19:09:50 +01:00
Bartosz Taudul
5ac26ce084
Init common Worker variables in header.
2019-01-06 19:04:50 +01:00
Bartosz Taudul
a313ed4720
Track separate time offset for GPU times.
...
This is second version of 0.4.2 dump file format. Previous 0.4.2 format
cannot be read anymore.
041/aa.tracy (0.4.1) {18987 KB} -> 042/aa.tracy (0.4.2) {10051 KB} 52.94% size change
041/android.tracy (0.4.1) {696753 KB} -> 042/android.tracy (0.4.2) {542738 KB} 77.90% size change
041/asset-new.tracy (0.4.1) {97163 KB} -> 042/asset-new.tracy (0.4.2) {78402 KB} 80.69% size change
041/asset-new-id.tracy (0.4.1) {105683 KB} -> 042/asset-new-id.tracy (0.4.2) {84341 KB} 79.81% size change
041/asset-old.tracy (0.4.1) {100205 KB} -> 042/asset-old.tracy (0.4.2) {80688 KB} 80.52% size change
041/big.tracy (0.4.1) {2246014 KB} -> 042/big.tracy (0.4.2) {939578 KB} 41.83% size change
041/crash.tracy (0.4.1) {143 KB} -> 042/crash.tracy (0.4.2) {131 KB} 91.37% size change
041/crash2.tracy (0.4.1) {3411 KB} -> 042/crash2.tracy (0.4.2) {1420 KB} 41.63% size change
041/darkrl.tracy (0.4.1) {31818 KB} -> 042/darkrl.tracy (0.4.2) {15762 KB} 49.54% size change
041/darkrl2.tracy (0.4.1) {18778 KB} -> 042/darkrl2.tracy (0.4.2) {7945 KB} 42.31% size change
041/darkrl-old.tracy (0.4.1) {151346 KB} -> 042/darkrl-old.tracy (0.4.2) {67449 KB} 44.57% size change
041/deadlock.tracy (0.4.1) {53 KB} -> 042/deadlock.tracy (0.4.2) {52 KB} 98.55% size change
041/gn-opengl.tracy (0.4.1) {45860 KB} -> 042/gn-opengl.tracy (0.4.2) {29005 KB} 63.25% size change
041/gn-vulkan.tracy (0.4.1) {45618 KB} -> 042/gn-vulkan.tracy (0.4.2) {29352 KB} 64.34% size change
041/long.tracy (0.4.1) {1583550 KB} -> 042/long.tracy (0.4.2) {1182800 KB} 74.69% size change
041/mem.tracy (0.4.1) {1243058 KB} -> 042/mem.tracy (0.4.2) {1369067 KB} 110.14% size change
041/multi.tracy (0.4.1) {14519 KB} -> 042/multi.tracy (0.4.2) {8000 KB} 55.10% size change
041/new.tracy (0.4.1) {1439 KB} -> 042/new.tracy (0.4.2) {1105 KB} 76.75% size change
041/q3bsp-mt.tracy (0.4.1) {1414323 KB} -> 042/q3bsp-mt.tracy (0.4.2) {949855 KB} 67.16% size change
041/q3bsp-st.tracy (0.4.1) {301334 KB} -> 042/q3bsp-st.tracy (0.4.2) {240347 KB} 79.76% size change
041/selfprofile.tracy (0.4.1) {399648 KB} -> 042/selfprofile.tracy (0.4.2) {197704 KB} 49.47% size change
041/tbrowser.tracy (0.4.1) {13052 KB} -> 042/tbrowser.tracy (0.4.2) {9503 KB} 72.81% size change
041/test.tracy (0.4.1) {60309 KB} -> 042/test.tracy (0.4.2) {40700 KB} 67.49% size change
041/virtualfile_hc.tracy (0.4.1) {108967 KB} -> 042/virtualfile_hc.tracy (0.4.2) {72424 KB} 66.46% size change
041/zfile_hc.tracy (0.4.1) {58814 KB} -> 042/zfile_hc.tracy (0.4.2) {39418 KB} 67.02% size change
2019-01-03 21:52:43 +01:00
Bartosz Taudul
f8ef5b726a
Store time deltas, instead of absolute time in trace dumps.
...
This change greatly reduces the size of saved dumps, but increase the
cost of processing during loading. One notable outlier in the dataset
below is mem.tracy, which increased in size, even if changes in the
memory dump saving scheme decrease size of the other traces.
041/aa.tracy (0.4.1) {18987 KB} -> 042/aa.tracy (0.4.2) {10140 KB} 53.40% size change
041/android.tracy (0.4.1) {696753 KB} -> 042/android.tracy (0.4.2) {542738 KB} 77.90% size change
041/asset-new.tracy (0.4.1) {97163 KB} -> 042/asset-new.tracy (0.4.2) {78402 KB} 80.69% size change
041/asset-new-id.tracy (0.4.1) {105683 KB} -> 042/asset-new-id.tracy (0.4.2) {84341 KB} 79.81% size change
041/asset-old.tracy (0.4.1) {100205 KB} -> 042/asset-old.tracy (0.4.2) {80688 KB} 80.52% size change
041/big.tracy (0.4.1) {2246014 KB} -> 042/big.tracy (0.4.2) {943083 KB} 41.99% size change
041/crash.tracy (0.4.1) {143 KB} -> 042/crash.tracy (0.4.2) {131 KB} 91.39% size change
041/crash2.tracy (0.4.1) {3411 KB} -> 042/crash2.tracy (0.4.2) {1425 KB} 41.80% size change
041/darkrl.tracy (0.4.1) {31818 KB} -> 042/darkrl.tracy (0.4.2) {15897 KB} 49.96% size change
041/darkrl2.tracy (0.4.1) {18778 KB} -> 042/darkrl2.tracy (0.4.2) {8002 KB} 42.62% size change
041/darkrl-old.tracy (0.4.1) {151346 KB} -> 042/darkrl-old.tracy (0.4.2) {67945 KB} 44.89% size change
041/deadlock.tracy (0.4.1) {53 KB} -> 042/deadlock.tracy (0.4.2) {52 KB} 98.55% size change
041/gn-opengl.tracy (0.4.1) {45860 KB} -> 042/gn-opengl.tracy (0.4.2) {30983 KB} 67.56% size change
041/gn-vulkan.tracy (0.4.1) {45618 KB} -> 042/gn-vulkan.tracy (0.4.2) {31349 KB} 68.72% size change
041/long.tracy (0.4.1) {1583550 KB} -> 042/long.tracy (0.4.2) {1225316 KB} 77.38% size change
041/mem.tracy (0.4.1) {1243058 KB} -> 042/mem.tracy (0.4.2) {1369291 KB} 110.15% size change
041/multi.tracy (0.4.1) {14519 KB} -> 042/multi.tracy (0.4.2) {8110 KB} 55.86% size change
041/new.tracy (0.4.1) {1439 KB} -> 042/new.tracy (0.4.2) {1108 KB} 77.01% size change
041/q3bsp-mt.tracy (0.4.1) {1414323 KB} -> 042/q3bsp-mt.tracy (0.4.2) {949855 KB} 67.16% size change
041/q3bsp-st.tracy (0.4.1) {301334 KB} -> 042/q3bsp-st.tracy (0.4.2) {240347 KB} 79.76% size change
041/selfprofile.tracy (0.4.1) {399648 KB} -> 042/selfprofile.tracy (0.4.2) {197713 KB} 49.47% size change
041/tbrowser.tracy (0.4.1) {13052 KB} -> 042/tbrowser.tracy (0.4.2) {9503 KB} 72.81% size change
041/test.tracy (0.4.1) {60309 KB} -> 042/test.tracy (0.4.2) {40700 KB} 67.49% size change
041/virtualfile_hc.tracy (0.4.1) {108967 KB} -> 042/virtualfile_hc.tracy (0.4.2) {72839 KB} 66.85% size change
041/zfile_hc.tracy (0.4.1) {58814 KB} -> 042/zfile_hc.tracy (0.4.2) {39608 KB} 67.35% size change
2018-12-30 23:42:17 +01:00
Bartosz Taudul
a220f38fbd
Add support for matching source locations ignoring case.
2018-12-18 16:52:29 +01:00
Bartosz Taudul
f42d52923a
No-op processing of lock terminate events.
2018-12-16 20:46:33 +01:00
Bartosz Taudul
984a711666
Send protocol version to verify handshake.
2018-09-09 19:28:53 +02:00
Bartosz Taudul
9f4d6692dc
Proper way to get full frame count.
2018-09-01 12:38:12 +02:00
Bartosz Taudul
8f1acf2571
Store explicit program name and capture time.
2018-08-29 01:02:29 +02:00
Bartosz Taudul
619fba41ab
Display crash information in info window.
2018-08-20 02:23:55 +02:00
Bartosz Taudul
3b526b074e
Send crash report.
2018-08-20 02:23:55 +02:00
Bartosz Taudul
366ea35593
Allow crash event reporting.
...
When crash happens there's no longer anything to profile -- don't wait
for unfinished zones to finish before sending client terminate
confirmation.
2018-08-20 01:03:16 +02:00
Bartosz Taudul
71bfd15d9e
Display host info.
2018-08-19 18:24:43 +02:00
Bartosz Taudul
203d9b4b85
Store host info.
2018-08-19 18:21:56 +02:00
Bartosz Taudul
c2c0f887aa
Display srcloc, callstack counts.
2018-08-14 16:41:27 +02:00
Bartosz Taudul
a51da71fa4
Add lock, plot counts to worker.
2018-08-08 19:21:53 +02:00
Bartosz Taudul
9d051cf5ee
Add support for discontinuous frames.
2018-08-05 02:15:54 +02:00
Bartosz Taudul
83eac36949
Add FrameData vector accessor.
2018-08-04 21:10:45 +02:00
Bartosz Taudul
9b4348b497
Handle frame name queries.
2018-08-04 21:10:45 +02:00
Bartosz Taudul
5e9b2e36be
Make getting start of time less cryptic.
2018-08-04 21:10:45 +02:00
Bartosz Taudul
23dfc2e3fc
Multiple frame sets support.
2018-08-04 21:10:45 +02:00
Bartosz Taudul
ada9f78678
Use StringDiscovery for plots.
2018-08-04 16:33:03 +02:00
Bartosz Taudul
18896044c4
Display explicit names of loaded things.
2018-07-29 16:56:46 +02:00
Bartosz Taudul
9f13475b52
Track trace version in worker.
2018-07-29 15:33:48 +02:00
Bartosz Taudul
ccc5c37af5
Always count source location zones.
2018-07-29 14:16:13 +02:00
Bartosz Taudul
766bf45a2b
Fix initialization of atomics.
2018-07-28 20:13:06 +02:00
Bartosz Taudul
a14238c199
Add sub progress display.
2018-07-28 18:56:52 +02:00
Bartosz Taudul
0bf0ceed3d
Track trace loading progress.
2018-07-28 17:59:17 +02:00
Bartosz Taudul
7d7877517e
Also remove child vectors from GPU events.
2018-07-22 19:47:01 +02:00
Bartosz Taudul
3a934b2ba3
Store children vectors in a separate data collection.
...
This reduces per-zone memory cost by 9 bytes if there are no children
and increases it by 4 bytes, if there are children. This is universally
a better solution, as the following data shows:
+++ /home/wolf/desktop/tracy-old/android.tracy +++
Vectors: 2794480
Size 0: 2373070 (84.92%)
Size 1: 70237 (2.51%)
Size 2+: 351173 (12.57%)
+++ /home/wolf/desktop/tracy-old/asset-new.tracy +++
Vectors: 1799227
Size 0: 1482691 (82.41%)
Size 1: 93272 (5.18%)
Size 2+: 223264 (12.41%)
+++ /home/wolf/desktop/tracy-old/asset-new-id.tracy +++
Vectors: 1977996
Size 0: 1640817 (82.95%)
Size 1: 97198 (4.91%)
Size 2+: 239981 (12.13%)
+++ /home/wolf/desktop/tracy-old/asset-old.tracy +++
Vectors: 1782395
Size 0: 1471437 (82.55%)
Size 1: 88813 (4.98%)
Size 2+: 222145 (12.46%)
+++ /home/wolf/desktop/tracy-old/big.tracy +++
Vectors: 180794047
Size 0: 172696094 (95.52%)
Size 1: 2799772 (1.55%)
Size 2+: 5298181 (2.93%)
+++ /home/wolf/desktop/tracy-old/darkrl.tracy +++
Vectors: 12014129
Size 0: 11611324 (96.65%)
Size 1: 134980 (1.12%)
Size 2+: 267825 (2.23%)
+++ /home/wolf/desktop/tracy-old/mem.tracy +++
Vectors: 383097
Size 0: 321932 (84.03%)
Size 1: 854 (0.22%)
Size 2+: 60311 (15.74%)
+++ /home/wolf/desktop/tracy-old/new.tracy +++
Vectors: 77536
Size 0: 63035 (81.30%)
Size 1: 8886 (11.46%)
Size 2+: 5615 (7.24%)
+++ /home/wolf/desktop/tracy-old/selfprofile.tracy +++
Vectors: 22940871
Size 0: 22704868 (98.97%)
Size 1: 73000 (0.32%)
Size 2+: 163003 (0.71%)
+++ /home/wolf/desktop/tracy-old/tbrowser.tracy +++
Vectors: 962682
Size 0: 695380 (72.23%)
Size 1: 43007 (4.47%)
Size 2+: 224295 (23.30%)
+++ /home/wolf/desktop/tracy-old/virtualfile_hc.tracy +++
Vectors: 529170
Size 0: 449386 (84.92%)
Size 1: 15694 (2.97%)
Size 2+: 64090 (12.11%)
+++ /home/wolf/desktop/tracy-old/zfile_hc.tracy +++
Vectors: 264849
Size 0: 220589 (83.29%)
Size 1: 9386 (3.54%)
Size 2+: 34874 (13.17%)
2018-07-22 16:05:50 +02:00
Bartosz Taudul
9291a88020
Zones can be now also grouped by call stack.
2018-07-21 20:26:13 +02:00
Bartosz Taudul
561d2dc360
Use the fastest mutex available.
...
The selection is based on the following test results:
MSVC:
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 11.641 ns/iter
2 thread contention: 141.559 ns/iter
3 thread contention: 242.733 ns/iter
4 thread contention: 409.807 ns/iter
5 thread contention: 561.544 ns/iter
6 thread contention: 785.845 ns/iter
=> std::mutex
No contention: 19.190 ns/iter
2 thread contention: 39.305 ns/iter
3 thread contention: 58.999 ns/iter
4 thread contention: 59.532 ns/iter
5 thread contention: 103.539 ns/iter
6 thread contention: 110.314 ns/iter
=> std::shared_timed_mutex
No contention: 45.487 ns/iter
2 thread contention: 96.351 ns/iter
3 thread contention: 142.871 ns/iter
4 thread contention: 184.999 ns/iter
5 thread contention: 336.608 ns/iter
6 thread contention: 542.551 ns/iter
=> std::shared_mutex
No contention: 10.861 ns/iter
2 thread contention: 17.495 ns/iter
3 thread contention: 31.126 ns/iter
4 thread contention: 40.468 ns/iter
5 thread contention: 15.677 ns/iter
6 thread contention: 64.505 ns/iter
Cygwin (clang):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 11.536 ns/iter
2 thread contention: 121.082 ns/iter
3 thread contention: 396.430 ns/iter
4 thread contention: 672.555 ns/iter
5 thread contention: 1327.761 ns/iter
6 thread contention: 14151.955 ns/iter
=> std::mutex
No contention: 62.583 ns/iter
2 thread contention: 3990.464 ns/iter
3 thread contention: 7161.189 ns/iter
4 thread contention: 9870.820 ns/iter
5 thread contention: 12355.178 ns/iter
6 thread contention: 14694.903 ns/iter
=> std::shared_timed_mutex
No contention: 91.687 ns/iter
2 thread contention: 1115.037 ns/iter
3 thread contention: 4183.792 ns/iter
4 thread contention: 15283.491 ns/iter
5 thread contention: 27812.477 ns/iter
6 thread contention: 35028.140 ns/iter
=> std::shared_mutex
No contention: 91.764 ns/iter
2 thread contention: 1051.826 ns/iter
3 thread contention: 5574.720 ns/iter
4 thread contention: 15721.416 ns/iter
5 thread contention: 27721.487 ns/iter
6 thread contention: 35420.404 ns/iter
Linux (x64):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 13.487 ns/iter
2 thread contention: 210.317 ns/iter
3 thread contention: 430.855 ns/iter
4 thread contention: 510.533 ns/iter
5 thread contention: 1003.609 ns/iter
6 thread contention: 1787.683 ns/iter
=> std::mutex
No contention: 12.403 ns/iter
2 thread contention: 157.122 ns/iter
3 thread contention: 186.791 ns/iter
4 thread contention: 265.073 ns/iter
5 thread contention: 283.778 ns/iter
6 thread contention: 270.687 ns/iter
=> std::shared_timed_mutex
No contention: 21.509 ns/iter
2 thread contention: 150.179 ns/iter
3 thread contention: 256.574 ns/iter
4 thread contention: 415.351 ns/iter
5 thread contention: 611.532 ns/iter
6 thread contention: 944.695 ns/iter
=> std::shared_mutex
No contention: 20.805 ns/iter
2 thread contention: 157.034 ns/iter
3 thread contention: 244.025 ns/iter
4 thread contention: 406.269 ns/iter
5 thread contention: 387.985 ns/iter
6 thread contention: 468.550 ns/iter
Linux (arm64):
=== Lock test, 6 threads ===
=> NonRecursiveBenaphore
No contention: 20.891 ns/iter
2 thread contention: 211.037 ns/iter
3 thread contention: 409.962 ns/iter
4 thread contention: 657.441 ns/iter
5 thread contention: 828.405 ns/iter
6 thread contention: 1131.827 ns/iter
=> std::mutex
No contention: 50.884 ns/iter
2 thread contention: 103.620 ns/iter
3 thread contention: 332.429 ns/iter
4 thread contention: 620.802 ns/iter
5 thread contention: 783.943 ns/iter
6 thread contention: 834.002 ns/iter
=> std::shared_timed_mutex
No contention: 64.948 ns/iter
2 thread contention: 173.191 ns/iter
3 thread contention: 490.352 ns/iter
4 thread contention: 660.668 ns/iter
5 thread contention: 1014.546 ns/iter
6 thread contention: 1451.553 ns/iter
=> std::shared_mutex
No contention: 64.521 ns/iter
2 thread contention: 195.222 ns/iter
3 thread contention: 490.819 ns/iter
4 thread contention: 654.786 ns/iter
5 thread contention: 955.759 ns/iter
6 thread contention: 1282.544 ns/iter
2018-07-14 00:39:01 +02:00
Bartosz Taudul
c8b5b9447d
Ignore dangling memory frees in on-demand mode.
2018-07-12 01:35:32 +02:00
Bartosz Taudul
e5064dec1e
Store on-demand connection state.
2018-07-12 01:21:04 +02:00
Bartosz Taudul
a78981e040
Store on-demand frame offset.
2018-07-10 22:42:00 +02:00
Bartosz Taudul
053284b1c7
Process custom free-form zone names.
2018-06-29 16:12:17 +02:00
Bartosz Taudul
865e8d8506
Extract zone name getting functionality.
2018-06-29 15:14:20 +02:00
Bartosz Taudul
b0aa13f4af
Callstack getters are const.
2018-06-24 16:15:49 +02:00
Bartosz Taudul
858628918b
Force inline AddCallstackPayload.
2018-06-24 15:28:09 +02:00
Bartosz Taudul
af0c64c888
Remove GPU resync support.
...
The whole concept is not really reliable. And it forces CPU to GPU sync,
which is bad.
2018-06-22 16:34:51 +02:00
Bartosz Taudul
cd5ca3e754
Don't use hash table to store 256 pointers.
2018-06-22 15:14:44 +02:00
Bartosz Taudul
55ddb64352
GPU context counter is now 8 bit.
2018-06-22 15:10:23 +02:00
Bartosz Taudul
35dc2f796e
Process GpuZoneBeginCallstack queue event.
2018-06-22 01:56:32 +02:00
Bartosz Taudul
4992ae6b39
Take callstack field in ZoneEvent into account in save/load.
2018-06-22 01:30:08 +02:00
Bartosz Taudul
5e01a8ead9
Process callstack queue event.
2018-06-22 01:15:49 +02:00
Bartosz Taudul
978e168cbd
Handle ZoneBeginCallstack queue event.
...
This is identical to ZoneBegin handling, but requires some additional
bookkeeping to account for the incoming callstack information.
2018-06-22 01:07:25 +02:00
Bartosz Taudul
973eab2b4a
Fix typo.
2018-06-20 23:42:00 +02:00
Bartosz Taudul
7912807133
Wait for transfer of pending callback frames.
2018-06-20 14:57:48 +02:00
Bartosz Taudul
4000f27e15
Stack frame accessor.
2018-06-20 01:18:59 +02:00
Bartosz Taudul
0c0afa5ac7
Process callstack frames.
2018-06-20 01:07:09 +02:00
Bartosz Taudul
203744cdd9
Callstack frame queries.
2018-06-20 00:25:26 +02:00
Bartosz Taudul
4eea85fdad
Callstack payload accessor.
2018-06-19 22:19:20 +02:00
Bartosz Taudul
06f34052a5
Have to track callstacks of both alloc and free.
2018-06-19 22:08:47 +02:00
Bartosz Taudul
c28465aa7c
Store unique callstack payloads.
2018-06-19 21:16:02 +02:00