200 Commits

Author SHA1 Message Date
Bartosz Taudul
215dc8a804 More compact GpuEvent struct (save 4 bytes).
Memory usage reduction of various traces (sizes in MB):

big         9011 -> 9007
frameimages 561  -> 552
fi-big      4144 -> 4139
long        5253 -> 5125
2019-10-13 14:42:52 +02:00
Bartosz Taudul
65ea33a60f Store memory callstack data as 24-bit ints.
This reduces MemEvent size from 40 to 38 bytes.

Memory usage reduction (sizes in MB):

chicken     2027 -> 2019
mem         6468 -> 6308
q3bsp-mt    5304 -> 5283
2019-10-01 22:38:17 +02:00
Bartosz Taudul
f0b957ec56 Store callstacks on 24 bits.
ZoneEvent is now 27 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9224 -> 9011  (97%)
chicken         2044 -> 2027  (99%)
drl-l-b         1443 -> 1383  (95%)
long            5327 -> 5253  (98%)
q3bsp-mt        5400 -> 5304  (98%)
selfprofile     1403 -> 1382  (98%)
2019-10-01 22:38:17 +02:00
Bartosz Taudul
c631e33f81 Add 24-bit int implementation. 2019-10-01 21:48:34 +02:00
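The 24-bit packing that this commit and the following ones rely on can be illustrated with a minimal sketch. The type name `Int24Sketch` and its members are hypothetical and are not taken from the repository; the sketch only shows how three bytes can hold an index below 2^24 without any padding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 24-bit unsigned integer (illustration only, not the
// project's actual type). Bytes are written with explicit shifts, so
// the layout does not depend on host endianness.
struct Int24Sketch
{
    uint8_t m_val[3];

    Int24Sketch() : m_val{ 0, 0, 0 } {}
    explicit Int24Sketch( uint32_t v ) { Set( v ); }

    void Set( uint32_t v )
    {
        assert( v < ( 1u << 24 ) );   // value must fit in 24 bits
        m_val[0] = uint8_t( v );
        m_val[1] = uint8_t( v >> 8 );
        m_val[2] = uint8_t( v >> 16 );
    }

    uint32_t Val() const
    {
        return uint32_t( m_val[0] )
             | ( uint32_t( m_val[1] ) << 8 )
             | ( uint32_t( m_val[2] ) << 16 );
    }
};

// A byte array has alignment 1, so the struct occupies exactly 3 bytes.
static_assert( sizeof( Int24Sketch ) == 3, "unexpected padding" );
```

Because the type has byte alignment, placing it inside a packed event structure saves one byte per field compared to a `uint32_t`, at the cost of reassembling the value on every read.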
Bartosz Taudul
472959b29f Remove irrelevant comment. 2019-10-01 01:15:43 +02:00
Bartosz Taudul
717a212563 Save another 2 bytes per ZoneEvent.
ZoneEvent is now 28 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9527 -> 9224  (96%)
chicken         2107 -> 2044  (97%)
drl-l-b         1479 -> 1443  (97%)
long            5412 -> 5327  (98%)
q3bsp-mt        5592 -> 5400  (96%)
selfprofile     1443 -> 1403  (97%)
2019-10-01 01:05:37 +02:00
Bartosz Taudul
4964aa9547 Assert on getting index only for active strings. 2019-10-01 00:40:58 +02:00
Bartosz Taudul
d228bcb622 Pack StringIdx in 24 bits.
This reduces ZoneEvent size from 32 to 30 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9902 -> 9527  (96%)
chicken         2172 -> 2107  (97%)
ctx-big          311 ->  309  (99%)
drl-l-b         1570 -> 1479  (94%)
long            5496 -> 5412  (98%)
mem             6468 -> 6468  (100%)
q3bsp-mt        5784 -> 5592  (96%)
selfprofile     1486 -> 1443  (97%)
2019-09-29 20:32:42 +02:00
Bartosz Taudul
a5ba74ed13 Handle multiple Vulkan threads. 2019-09-23 17:27:49 +02:00
Bartosz Taudul
8fe9b56b6f Calculate frame statistics. 2019-09-16 22:02:47 +02:00
Bartosz Taudul
19f8f9f101 Use proper type. 2019-08-30 00:56:11 +02:00
Bartosz Taudul
a8d204821e Signed left shift is undefined. 2019-08-29 18:42:29 +02:00
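For context, left-shifting a negative signed value is undefined behavior in C++ before C++20. A common workaround, sketched below, is to shift in the corresponding unsigned type and convert back. The helper name `ShiftLeftSketch` is hypothetical; the commit's actual change is not shown in this listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: performs the shift on uint64_t, where overflow
// simply discards high bits, then converts back to int64_t.
// The unsigned -> signed conversion is modular on all mainstream
// compilers (and guaranteed to be so since C++20).
inline int64_t ShiftLeftSketch( int64_t v, unsigned bits )
{
    assert( bits < 64 );   // shifting by >= width is undefined as well
    return int64_t( uint64_t( v ) << bits );
}
```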
Bartosz Taudul
a2f968d843 Compress thread id in MessageData. 2019-08-28 21:03:01 +02:00
Bartosz Taudul
1712431dfd Compress external threads. Saves 4 bytes per ctx switch.
Dropped support for loading context switch data in previous versions of
traces.
2019-08-19 23:09:58 +02:00
Bartosz Taudul
3b8518f7b6 Save/load CPU thread data. 2019-08-18 01:53:38 +02:00
Bartosz Taudul
103645c2fa Calculate cpu thread data statistics. 2019-08-18 01:50:49 +02:00
Bartosz Taudul
414f903cc5 Collect thread wakeup data. 2019-08-17 17:05:29 +02:00
Bartosz Taudul
f957f64ce1 No magic numbers. 2019-08-17 16:26:59 +02:00
Bartosz Taudul
69527d2f71 Collect per-cpu context switch data. 2019-08-16 16:51:18 +02:00
Bartosz Taudul
889eddd646 Pack ContextSwitchData. Saves 3 bytes per context switch region. 2019-08-15 23:53:47 +02:00
Bartosz Taudul
c22c259a13 Pack time and thread in MemEvent.
This saves 4 bytes per logged memory allocation. Memory savings for
selected traces:

android     2945 MB -> 2766 MB
chicken     2261 MB -> 2245 MB
q3bsp-mt    6085 MB -> 6043 MB
mem         6788 MB -> 6468 MB
2019-08-15 23:02:43 +02:00
Bartosz Taudul
e43a57f6b3 Remove irrelevant comments. 2019-08-15 21:51:47 +02:00
Bartosz Taudul
a635e54a79 Pack MessageData. 2019-08-15 21:42:24 +02:00
Bartosz Taudul
04c8830f86 Cosmetics. 2019-08-15 21:38:00 +02:00
Bartosz Taudul
45401fc54c Use proper variable name. 2019-08-15 21:34:19 +02:00
Bartosz Taudul
c9d7b96c81 Prevent int16_t -> int64_t promotion on negative numbers. 2019-08-15 20:58:16 +02:00
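The pitfall named in this commit can be shown in isolation. When a negative `int16_t` is widened directly to 64 bits it is sign-extended, so OR-ing it into a packed field overwrites every bit above it. The function names below are hypothetical and only illustrate the effect; they are not the repository's code.

```cpp
#include <cassert>
#include <cstdint>

// Naive widening: -1 becomes 0xFFFFFFFFFFFFFFFF because of sign extension.
inline uint64_t WidenNaiveSketch( int16_t v )
{
    return uint64_t( int64_t( v ) );
}

// Widening through uint16_t keeps only the low 16 bits: -1 becomes 0xFFFF.
inline uint64_t WidenMaskedSketch( int16_t v )
{
    return uint64_t( uint16_t( v ) );
}

// Stores v in the low 16 bits of an already-shifted 64-bit value
// without disturbing the upper 48 bits.
inline uint64_t OrLow16Sketch( uint64_t packed, int16_t v )
{
    assert( ( packed & 0xFFFF ) == 0 );   // low bits must be free
    return packed | WidenMaskedSketch( v );
}
```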
Bartosz Taudul
5e20b3f28a Pack time and source location in LockEvent. 2019-08-15 20:39:16 +02:00
Bartosz Taudul
bf3ad57456 Pack start time and srcloc together in ZoneEvent.
This reduces ZoneEvent struct size by 2 bytes. Memory savings on various
captures:

big             10.62 GB -> 10.29 GB
chicken          2342 MB ->  2276 MB
drl-light-big    1706 MB ->  1635 MB
q3bsp-mt         6277 MB ->  6085 MB
2019-08-15 20:17:36 +02:00
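One way to pack a start time and a 16-bit source location identifier into a single 64-bit word is sketched below. The layout (upper 48 bits for time, lower 16 bits for the identifier), the struct name `ZoneStartSketch`, and the restriction to non-negative times are all assumptions made for illustration; the listing does not show the layout the commit actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed field: bits [63..16] hold the time,
// bits [15..0] hold the source location identifier.
struct ZoneStartSketch
{
    uint64_t packed = 0;

    void Set( int64_t time, int16_t srcloc )
    {
        // Simplifying assumption: time is non-negative and fits in 48 bits.
        assert( time >= 0 && time < ( int64_t( 1 ) << 48 ) );
        // Going through uint16_t avoids sign extension of negative ids.
        packed = ( uint64_t( time ) << 16 ) | uint16_t( srcloc );
    }

    int64_t Start() const { return int64_t( packed >> 16 ); }
    int16_t SrcLoc() const { return int16_t( uint16_t( packed & 0xFFFF ) ); }
};

// Time plus identifier now take 8 bytes instead of 8 + 2.
static_assert( sizeof( ZoneStartSketch ) == 8, "unexpected size" );
```

The 2 bytes saved per event match the reduction the commit message reports, under the assumption that the time was previously stored as a separate 64-bit value.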
Bartosz Taudul
659907c972 Store srcloc identifiers using 16 bits.
This reduces various structure sizes by 2 bytes. Memory usage reduction
on various traces:

big               11 GB -> 10.62 GB
chicken         2436 MB ->  2342 MB
drl-light-big   1761 MB ->  1706 MB
q3bsp-mt        6469 MB ->  6277 MB
2019-08-15 20:15:48 +02:00
Bartosz Taudul
32c7d13159 Count size of some more structures. 2019-08-15 14:15:40 +02:00
Bartosz Taudul
3e01ca3269 Calculate how long a thread was in the running state. 2019-08-14 17:12:48 +02:00
Bartosz Taudul
0bb0c10e3c Revert "Save one byte on ContextSwitchData."
Counting bits is hard, let's go shopping.
2019-08-14 13:55:05 +02:00
Bartosz Taudul
f285e0f5cc Save one byte on ContextSwitchData. 2019-08-13 15:16:46 +02:00
Bartosz Taudul
419f74280d Store context switches. 2019-08-13 02:35:32 +02:00
Bartosz Taudul
8aa0be39d5 Drop support for CPU id queries. 2019-08-12 23:05:34 +02:00
Bartosz Taudul
de953bfaa8 Use proper data type for callstack storage in GPU zones. 2019-06-22 14:04:27 +02:00
Bartosz Taudul
37d1457b44 Frame image may need flipping. 2019-06-12 15:28:32 +02:00
Bartosz Taudul
eb6ac5e6e1 Store frame reference in frame images. 2019-06-12 00:55:02 +02:00
Bartosz Taudul
bef1988800 Compress frame images using LZ4. 2019-06-08 12:17:18 +02:00
Bartosz Taudul
34b84bb284 Add frame image index to frame data. 2019-06-06 21:44:48 +02:00
Bartosz Taudul
e5bb6011c5 Frame image transfer prototype. 2019-06-06 21:39:54 +02:00
Bartosz Taudul
0da1e8551f Track lock contention status. 2019-05-12 16:17:17 +02:00
Bartosz Taudul
4850e19ebd Store color in message data. 2019-05-10 20:26:27 +02:00
Bartosz Taudul
a7886cf82c Replace linear search with hash lookup. 2019-04-03 16:24:16 +02:00
Bartosz Taudul
7e6a8135df Remove double indirection in GetNextLockEvent(). 2019-03-16 14:18:43 +01:00
Bartosz Taudul
4d66317bc3 Add per-thread time ranges to lock maps. 2019-03-16 02:50:51 +01:00
Bartosz Taudul
9fc022346b Replace frame pointers with callstack frame ids. 2019-03-03 18:05:03 +01:00
Bartosz Taudul
1feedb17ac Add callstack frame identifier and the required plumbing. 2019-03-03 18:05:03 +01:00
Bartosz Taudul
e190faa7e1 Save/load CPU usage plot. 2019-02-21 22:56:59 +01:00
Bartosz Taudul
e9baa80bf3 Process CPU usage reports. 2019-02-21 22:56:59 +01:00
Bartosz Taudul
b945f83169 Don't separate inclusive/exclusive counts.
There is no way for one frame to have both. Coloring is preserved and is
now determined by presence of children.
2019-02-06 22:36:21 +01:00
Bartosz Taudul
ddad475c19 Make it possible to store multiple frames at single frame address. 2019-01-20 19:11:48 +01:00
Bartosz Taudul
dcc6bee607 Process zone validation messages. 2019-01-14 22:56:10 +01:00
Bartosz Taudul
9360df89b1 Store announce and terminate time of locks. 2018-12-16 21:07:26 +01:00
Bartosz Taudul
9301986bae Collect callstacks for each entry in call stack tree. 2018-09-27 22:56:44 +02:00
Bartosz Taudul
3b526b074e Send crash report. 2018-08-20 02:23:55 +02:00
Bartosz Taudul
df14cf5330 Implement callstack tree of memory allocations. 2018-08-14 18:37:06 +02:00
Bartosz Taudul
9d051cf5ee Add support for discontinuous frames. 2018-08-05 02:15:54 +02:00
Bartosz Taudul
23dfc2e3fc Multiple frame sets support. 2018-08-04 21:10:45 +02:00
Bartosz Taudul
7d7877517e Also remove child vectors from GPU events. 2018-07-22 19:47:01 +02:00
Bartosz Taudul
3a934b2ba3 Store children vectors in a separate data collection.
This reduces per-zone memory cost by 9 bytes if there are no children
and increases it by 4 bytes, if there are children. This is universally
a better solution, as the following data shows:

+++ /home/wolf/desktop/tracy-old/android.tracy +++
Vectors: 2794480
Size 0: 2373070 (84.92%)
Size 1: 70237 (2.51%)
Size 2+: 351173 (12.57%)
+++ /home/wolf/desktop/tracy-old/asset-new.tracy +++
Vectors: 1799227
Size 0: 1482691 (82.41%)
Size 1: 93272 (5.18%)
Size 2+: 223264 (12.41%)
+++ /home/wolf/desktop/tracy-old/asset-new-id.tracy +++
Vectors: 1977996
Size 0: 1640817 (82.95%)
Size 1: 97198 (4.91%)
Size 2+: 239981 (12.13%)
+++ /home/wolf/desktop/tracy-old/asset-old.tracy +++
Vectors: 1782395
Size 0: 1471437 (82.55%)
Size 1: 88813 (4.98%)
Size 2+: 222145 (12.46%)
+++ /home/wolf/desktop/tracy-old/big.tracy +++
Vectors: 180794047
Size 0: 172696094 (95.52%)
Size 1: 2799772 (1.55%)
Size 2+: 5298181 (2.93%)
+++ /home/wolf/desktop/tracy-old/darkrl.tracy +++
Vectors: 12014129
Size 0: 11611324 (96.65%)
Size 1: 134980 (1.12%)
Size 2+: 267825 (2.23%)
+++ /home/wolf/desktop/tracy-old/mem.tracy +++
Vectors: 383097
Size 0: 321932 (84.03%)
Size 1: 854 (0.22%)
Size 2+: 60311 (15.74%)
+++ /home/wolf/desktop/tracy-old/new.tracy +++
Vectors: 77536
Size 0: 63035 (81.30%)
Size 1: 8886 (11.46%)
Size 2+: 5615 (7.24%)
+++ /home/wolf/desktop/tracy-old/selfprofile.tracy +++
Vectors: 22940871
Size 0: 22704868 (98.97%)
Size 1: 73000 (0.32%)
Size 2+: 163003 (0.71%)
+++ /home/wolf/desktop/tracy-old/tbrowser.tracy +++
Vectors: 962682
Size 0: 695380 (72.23%)
Size 1: 43007 (4.47%)
Size 2+: 224295 (23.30%)
+++ /home/wolf/desktop/tracy-old/virtualfile_hc.tracy +++
Vectors: 529170
Size 0: 449386 (84.92%)
Size 1: 15694 (2.97%)
Size 2+: 64090 (12.11%)
+++ /home/wolf/desktop/tracy-old/zfile_hc.tracy +++
Vectors: 264849
Size 0: 220589 (83.29%)
Size 1: 9386 (3.54%)
Size 2+: 34874 (13.17%)
2018-07-22 16:05:50 +02:00
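The trade-off stated in this commit message (9 bytes saved for a zone without children, 4 bytes added for a zone with children) can be turned into an expected per-zone saving. The sketch below assumes that each counted vector corresponds to one zone; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Expected net bytes saved per zone, given how many child vectors
// exist in total and how many of them are empty.
// Uses the 9-byte saving and 4-byte cost quoted in the commit message.
inline double NetBytesSavedPerZoneSketch( uint64_t total, uint64_t empty )
{
    assert( total > 0 && empty <= total );
    const double p = double( empty ) / double( total );   // share of empty vectors
    return 9.0 * p - 4.0 * ( 1.0 - p );
}
```

The break-even point is an empty share of 4/13, about 30.8%. The least favorable trace listed, tbrowser.tracy at 72.23% empty, still comes out at roughly 5.4 bytes saved per zone, and big.tracy at 95.52% empty at roughly 8.4 bytes.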
Bartosz Taudul
053284b1c7 Process custom free-form zone names. 2018-06-29 16:12:17 +02:00
Bartosz Taudul
4a467b6d03 Remove GPU resync leftovers. 2018-06-28 00:48:23 +02:00
Bartosz Taudul
11cf650be6 Fix GPU queries ordering.
With multithreaded Vulkan rendering it is possible that GPU time queries
will be sent in a different order than the originating CPU queries were
made. This commit changes the in-order queue to a map of queries,
waiting to be resolved.
2018-06-22 16:37:54 +02:00
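The change from an in-order queue to a map can be sketched as follows. All names here (`GpuZoneSketch`, `PendingGpuQueriesSketch`, the 16-bit query identifier) are hypothetical and chosen for illustration; the sketch only demonstrates resolving GPU timestamps that arrive in a different order than the CPU-side queries were issued.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical GPU zone record; gpuTime stays -1 until resolved.
struct GpuZoneSketch
{
    int64_t cpuTime = 0;
    int64_t gpuTime = -1;
};

// Pending queries keyed by query id instead of a FIFO queue, so the
// arrival order of GPU timestamps no longer matters.
class PendingGpuQueriesSketch
{
public:
    void OnCpuQuery( uint16_t queryId, GpuZoneSketch* zone )
    {
        assert( m_pending.find( queryId ) == m_pending.end() );
        m_pending.emplace( queryId, zone );
    }

    // Returns false if no query with this id is waiting.
    bool OnGpuTime( uint16_t queryId, int64_t gpuTime )
    {
        auto it = m_pending.find( queryId );
        if( it == m_pending.end() ) return false;
        it->second->gpuTime = gpuTime;
        m_pending.erase( it );
        return true;
    }

    std::size_t Size() const { return m_pending.size(); }

private:
    std::unordered_map<uint16_t, GpuZoneSketch*> m_pending;
};
```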
Bartosz Taudul
35dc2f796e Process GpuZoneBeginCallstack queue event. 2018-06-22 01:56:32 +02:00
Bartosz Taudul
205a4e4ca2 Add callstack index to ZoneEvent. 2018-06-22 01:11:03 +02:00
Bartosz Taudul
2a618c90d5 Properly save compressed thread in GPU events. 2018-06-20 23:12:49 +02:00
Bartosz Taudul
88b1955a5a Filename in callstack frame is not a persistent pointer. 2018-06-20 01:26:05 +02:00
Bartosz Taudul
203744cdd9 Callstack frame queries. 2018-06-20 00:25:26 +02:00
Bartosz Taudul
06f34052a5 Have to track callstacks of both alloc and free. 2018-06-19 22:08:47 +02:00
Bartosz Taudul
e03493f082 Store callstack index as uint32_t. 2018-06-19 21:39:22 +02:00
Bartosz Taudul
59dc55002b Callstack ptr in server data structures.
Will probably be reduced to a 32-bit index later on.
2018-06-19 18:52:10 +02:00
Bartosz Taudul
bb0631585c Store thread id of GPU events. 2018-06-17 19:07:07 +02:00
Bartosz Taudul
dcd6cac078 Save GPU timestamp period.
Bump file version to 0.3.2.
2018-06-17 18:27:42 +02:00
Bartosz Taudul
53aea660c8 Store thread id in MessageData. 2018-05-25 21:10:38 +02:00
Bartosz Taudul
b18841aa75 Store ordered list of memory frees. 2018-05-02 17:59:50 +02:00
Bartosz Taudul
bc84ebc338 Read/write LockEvent data in one go. 2018-04-29 03:41:58 +02:00
Bartosz Taudul
9769cc4d7d Read/write most of MemEvent in one go. 2018-04-29 03:41:58 +02:00
Bartosz Taudul
d8bfe7de2e Create memory plot based on memory alloc/free events. 2018-04-28 15:49:12 +02:00
Bartosz Taudul
cd34ed6968 Two plot types: user and memory.
Only user plots are saved in a dump file.
2018-04-28 15:48:05 +02:00
Bartosz Taudul
bf99bff87d Store MemEvents directly in the vector. 2018-04-03 14:17:51 +02:00
Bartosz Taudul
52f59c90bf Track memory usage. 2018-04-02 00:00:49 +02:00
Bartosz Taudul
a574f98f0c Memory events are now serialized. 2018-04-01 20:13:01 +02:00
Bartosz Taudul
b12375815c Broken memory events processing. 2018-04-01 02:03:34 +02:00
Bartosz Taudul
fe6c753f12 Store lock thread map in flat hash map. 2018-03-20 15:40:25 +01:00
Bartosz Taudul
9dfa9c95cb Read and write whole ZoneEvent/GpuEvent data at once. 2018-03-15 21:59:16 +01:00
Bartosz Taudul
5cb917e868 No nonsense union. 2018-03-04 17:52:51 +01:00
Bartosz Taudul
5afdccfc46 Properly initialize data.
Unused bitfield bits and inactive string index/reference had garbage
values in release builds, which prevented de-duplication of source
location payloads.
2018-03-04 17:47:26 +01:00
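Why uninitialized fields defeat de-duplication can be shown with a small sketch. It assumes payloads are compared byte for byte, which the commit message implies but does not state; the struct layout and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical payload: nameIdx is meaningful only when nameActive != 0.
// Three uint32_t members, so the struct has no padding bytes.
struct SrcLocPayloadSketch
{
    uint32_t line;
    uint32_t nameActive;
    uint32_t nameIdx;
};

// Zeroes the whole payload first, so the inactive nameIdx field has a
// deterministic value instead of whatever was in memory.
inline void InitPayloadSketch( SrcLocPayloadSketch& p, uint32_t line )
{
    std::memset( &p, 0, sizeof( p ) );
    p.line = line;
}

// Bytewise de-duplication: returns the index of an identical payload,
// appending a new entry only when none is found.
inline uint32_t InternSketch( std::vector<SrcLocPayloadSketch>& pool,
                              const SrcLocPayloadSketch& p )
{
    for( std::size_t i = 0; i < pool.size(); i++ )
    {
        if( std::memcmp( &pool[i], &p, sizeof( p ) ) == 0 ) return uint32_t( i );
    }
    assert( pool.size() < UINT32_MAX );
    pool.push_back( p );
    return uint32_t( pool.size() - 1 );
}
```

Without the `memset`, two logically identical payloads could differ in the unused `nameIdx` bytes and would be stored twice.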
Bartosz Szreder
9e3f18a62a Split data handling code from the view. 2018-02-21 16:41:37 +01:00
Bartosz Szreder
d5fe006e2d Add missing include for charutil::hash() 2018-02-12 19:07:55 +01:00
Bartosz Taudul
d0d3bb1141 Store shared lock bits only for shared locks. 2017-12-17 18:44:31 +01:00
Bartosz Taudul
340506406e Shared lock state machine. 2017-12-10 23:30:13 +01:00
Bartosz Taudul
b07718ab9e Track list of shared locks. 2017-12-10 22:42:39 +01:00
Bartosz Taudul
398eecbb94 Store LockEvent type as an enum class. 2017-12-10 22:37:56 +01:00
Bartosz Taudul
3567d7edd8 Reintroduce lock announce events. 2017-12-10 21:40:48 +01:00
Bartosz Taudul
981bbfe42d Reorder LockEvent fields. 2017-12-09 19:13:59 +01:00
Bartosz Taudul
48678b3bd7 Drop bitfield usage. 2017-12-05 22:34:48 +01:00
Bartosz Taudul
081087b9ce Drop an indirection level in plots. 2017-12-05 21:24:09 +01:00
Bartosz Taudul
5246098c79 GPU context hiding plumbing. 2017-11-30 15:31:31 +01:00
Bartosz Taudul
a515bf8878 Perform GPU to CPU resynchronization on each collect event. 2017-11-25 13:33:57 +01:00