495 Commits

Author SHA1 Message Date
Bartosz Taudul
bb2d44ae08 All time deltas must be processed. 2019-11-07 16:14:23 +01:00
Bartosz Taudul
ea2c329510 Input data *must not* be changed.
Not even for a short moment.
2019-11-07 01:29:11 +01:00
Bartosz Taudul
4a4fe82a1b No need to inject string terminator.
Comparison in m_data.stringMap already takes string size into account,
as a charutil::StringKey optimization.
2019-11-07 01:28:29 +01:00
Bartosz Taudul
dfad9695d2 Compress frame image data right as it arrives.
This removes the need to store temporary uncompressed image buffers,
which involves constant memory allocation and freeing. Instead, just one
permanent buffer is used, and only because the input data cannot change
during processing.
2019-11-06 23:29:59 +01:00
Bartosz Taudul
46d33f45bf Frame image packer doesn't care about width and height. 2019-11-06 22:53:01 +01:00
Bartosz Taudul
10a3516099 Delete uncompressed frame image data. 2019-11-06 22:38:19 +01:00
Bartosz Taudul
df0e28a61f Remove more unneeded includes. 2019-11-06 01:37:58 +01:00
Bartosz Taudul
f53637891a Remove LZ4 include from TracyWorker.hpp. 2019-11-06 01:25:38 +01:00
Bartosz Taudul
661c4a417b Process and store plot value formatting. 2019-11-05 18:02:08 +01:00
Bartosz Taudul
a7a739eea9 Use precalculated context switch usage data. 2019-11-05 01:41:27 +01:00
Bartosz Taudul
51090e5fb9 Implement ctx switch usage reconstruction. 2019-11-05 01:28:44 +01:00
Bartosz Taudul
50b96c757e Context switch usage reconstruction skeleton. 2019-11-05 01:28:44 +01:00
Bartosz Taudul
d9c3238462 Save 2 bytes per PlotItem.
Memory savings:

android     2614 MB -> 2487 MB (95%)
chicken     1932 MB -> 1852 MB (95%)
mem         6067 MB -> 5747 MB (94%)
q3bsp-mt    5059 MB -> 5017 MB (99%)
q3bsp-st    1211 MB -> 1171 MB (96%)
2019-11-03 16:29:45 +01:00
Bartosz Taudul
4bc1588a5e Clear proper vector. 2019-11-02 16:57:18 +01:00
Bartosz Taudul
0df29d1e0b Use short ptr for source location payload data. 2019-11-02 16:54:12 +01:00
Bartosz Taudul
72efbe28ed Use short ptr for message data. 2019-11-02 16:54:12 +01:00
Bartosz Taudul
1e4022e05b Use proper comparison. 2019-11-02 16:54:12 +01:00
Bartosz Taudul
a40bbacb17 Use short ptr for CPU zone data. 2019-11-02 16:54:12 +01:00
Bartosz Taudul
cb20bf01f9 Use short ptr for GPU zone data. 2019-11-02 16:54:11 +01:00
Bartosz Taudul
c7664b0a98 Use short ptr in LockEventPtr. 2019-11-02 16:17:45 +01:00
Bartosz Taudul
45ff14d678 Fix saving source location payload data. 2019-11-02 14:28:59 +01:00
Bartosz Taudul
16bc862904 Save sizes of children vectors to prevent reallocation. 2019-11-02 12:38:32 +01:00
Bartosz Taudul
13b656fe61 Make srcloc dynamic color depend on function name. 2019-11-01 20:17:25 +01:00
Bartosz Taudul
39988ad636 Check for shutdown in background processing thread. 2019-10-31 21:41:21 +01:00
Bartosz Taudul
456deefdbc Keep child idx on stack. 2019-10-30 23:55:21 +01:00
Bartosz Taudul
25b610a36f Pack child into GPU start/end in GpuEvent (saves 4 bytes).
long    5152 MB -> 5061 MB
2019-10-30 23:50:37 +01:00
Bartosz Taudul
e8286600d1 Use -1 as invalid GPU start time. 2019-10-30 23:12:43 +01:00
Bartosz Taudul
7ce8c772ad Disallow negative GPU times.
Shouldn't happen, but GPU timestamps are a shitshow, so better be safe
than sorry.
2019-10-30 22:37:07 +01:00
Bartosz Taudul
ae4794ab4c Save 2 bytes in ContextSwitchData and ContextSwitchCpu. 2019-10-30 22:25:46 +01:00
Bartosz Taudul
99d198d0bf Pack csAlloc in MemEvent (saves 3 bytes).
Memory usage change on selected traces:

android     2699 MB -> 2613 MB
chicken     2019 MB -> 2007 MB
mem         6308 MB -> 6068 MB
q3bsp-mt    5283 MB -> 5252 MB
q3bsp-st    1241 MB -> 1211 MB
2019-10-30 22:01:13 +01:00
Bartosz Taudul
6f0dc2885f Fix connection abort. 2019-10-28 23:32:51 +01:00
Bartosz Taudul
8050622b0f Read and decompress network data on a separate thread. 2019-10-28 23:22:50 +01:00
Bartosz Taudul
e0356ae01e Cosmetics. 2019-10-28 22:53:06 +01:00
Bartosz Taudul
99b7e8ad92 Close socket when shutting down. 2019-10-28 22:52:52 +01:00
Bartosz Taudul
788ca2e5df Spawn no-op network thread. 2019-10-28 22:45:10 +01:00
Bartosz Taudul
7f07f5beb4 Free child time stack. 2019-10-26 23:32:16 +02:00
Bartosz Taudul
01985f50ef Cache source location zones counter search. 2019-10-26 16:33:40 +02:00
Bartosz Taudul
1d0084aa28 Add cache for last accessed source location zones. 2019-10-25 21:29:55 +02:00
Bartosz Taudul
b5419944aa Only write to memory if value has changed. 2019-10-25 21:28:55 +02:00
Bartosz Taudul
779063a18b Cache last shrunk source location. 2019-10-25 21:07:28 +02:00
Bartosz Taudul
294793367f Cache last CheckSourceLocation query.
Just knowing that the query was performed is enough here -- this
function adds a new source location entry if there isn't one already.
2019-10-25 21:01:33 +02:00
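
The reasoning in 294793367f (an "ensure it exists" call only needs to remember that the same key was just handled) can be illustrated with a minimal sketch. The class and member names below are hypothetical and are not taken from the Tracy sources; a plain `std::unordered_set` stands in for whatever container the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical registry: EnsureKnown() adds an id if it is missing.
class SourceLocationRegistry
{
public:
    void EnsureKnown( uint64_t id )
    {
        // Fast path: the previous call already guaranteed that this id
        // exists, so no result needs to be cached, only the key.
        if( m_hasLast && id == m_lastQuery ) return;

        m_lastQuery = id;
        m_hasLast = true;
        ++m_lookups;             // counts slow-path container accesses
        m_known.insert( id );    // no-op if the id is already present
        assert( m_known.count( id ) == 1 );
    }

    size_t Size() const { return m_known.size(); }
    size_t Lookups() const { return m_lookups; }

private:
    std::unordered_set<uint64_t> m_known;
    uint64_t m_lastQuery = 0;
    bool m_hasLast = false;
    size_t m_lookups = 0;
};
```

A one-entry cache like this pays off only when consecutive queries tend to repeat the same key, which is the access pattern the commit message implies.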
Bartosz Taudul
0f2503d334 Send time deltas in GPU time events. 2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176 Send delta times for context switches. 2019-10-25 19:13:11 +02:00
Bartosz Taudul
29c42cc8d7 Fix assert. 2019-10-25 01:00:32 +02:00
Bartosz Taudul
17a51c898e No need to check if vector is empty. 2019-10-25 00:54:46 +02:00
Bartosz Taudul
b5e759bc5a Don't calculate child index twice. 2019-10-25 00:54:46 +02:00
Bartosz Taudul
70f1074490 Don't iterate over children to calculate zone self time. 2019-10-25 00:33:44 +02:00
Bartosz Taudul
1fe76be955 Don't reconstruct lock event time on insert. 2019-10-24 23:25:04 +02:00
Bartosz Taudul
b83d0f46d9 Improve updating last time.
Avoid LHS (load-hit-store); don't write unless needed.
2019-10-24 23:23:52 +02:00
Bartosz Taudul
721f3c8925 Callstack is already zero-initialized. 2019-10-24 23:05:39 +02:00
Bartosz Taudul
c9da5f1474 Use cached thread retriever. 2019-10-24 22:34:18 +02:00
Bartosz Taudul
5873561b54 Add cached thread retriever. 2019-10-24 22:33:48 +02:00
Bartosz Taudul
06bc802107 Avoid load-hit-store. 2019-10-24 22:24:00 +02:00
Bartosz Taudul
1cfb5adc44 Count transferred data size. 2019-10-24 00:47:16 +02:00
Bartosz Taudul
ba61a9ed84 Transfer time deltas, not absolute times.
This change significantly reduces network bandwidth requirements.

Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
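
The delta scheme described in ba61a9ed84 can be sketched roughly as below. This is an illustration under assumptions, not the Tracy wire protocol: `DeltaEncoder`, `DeltaDecoder` and `RoundTrip` are hypothetical names, and a single shared reference time is assumed, whereas a real protocol may keep separate references per event type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sender side: remembers the previous timestamp and emits
// only the difference to it.
class DeltaEncoder
{
public:
    int64_t Encode( int64_t absoluteTime )
    {
        const int64_t delta = absoluteTime - m_ref;
        m_ref = absoluteTime;
        return delta;
    }

private:
    int64_t m_ref = 0;
};

// Hypothetical receiver side: rebuilds absolute times by accumulating
// deltas. Skipping a single delta would shift every later timestamp.
class DeltaDecoder
{
public:
    int64_t Decode( int64_t delta )
    {
        m_ref += delta;
        return m_ref;
    }

private:
    int64_t m_ref = 0;
};

// Sends a list of timestamps through the encoder and the decoder.
inline std::vector<int64_t> RoundTrip( const std::vector<int64_t>& times )
{
    DeltaEncoder enc;
    DeltaDecoder dec;
    std::vector<int64_t> out;
    out.reserve( times.size() );
    for( int64_t t : times )
    {
        const int64_t decoded = dec.Decode( enc.Encode( t ) );
        assert( decoded == t );   // the scheme must be lossless
        out.push_back( decoded );
    }
    return out;
}
```

Deltas between neighboring events are small numbers with mostly zero high bytes, which is presumably why they are cheaper to send than full absolute timestamps once the stream is compressed. Because the decoder state depends on every value seen so far, no delta may be skipped on the receiving side.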
Bartosz Taudul
5c92eae3ed Add early exit for invalid times. 2019-10-20 18:47:50 +02:00
Bartosz Taudul
4d761def61 Microoptimize comparison. 2019-10-16 20:26:39 +02:00
Bartosz Taudul
3ae5c125f6 Implement counting CPU usage (ctx switch) at a given time. 2019-10-15 16:54:43 +02:00
Bartosz Taudul
3ce6b1205f Don't iterate over 256 CPUs. 2019-10-15 16:13:53 +02:00
Bartosz Taudul
eccb0b1e4a Track max CPU present in context switch data. 2019-10-15 16:13:53 +02:00
Bartosz Taudul
bdb8516d04 Make sure context switch end time wasn't set already. 2019-10-15 14:54:28 +02:00
Bartosz Taudul
215dc8a804 More compact GpuEvent struct (save 4 bytes).
Memory usage reduction of various traces (sizes in MB):

big         9011 -> 9007
frameimages 561  -> 552
fi-big      4144 -> 4139
long        5253 -> 5125
2019-10-13 14:42:52 +02:00
Bartosz Taudul
5e1894dd79 Count GPU zones. 2019-10-13 14:13:04 +02:00
Bartosz Taudul
65ea33a60f Store memory callstack data as 24-bit ints.
This reduces MemEvent size from 40 to 38 bytes.

Memory usage reduction (sizes in MB):

chicken     2027 -> 2019
mem         6468 -> 6308
q3bsp-mt    5304 -> 5283
2019-10-01 22:38:17 +02:00
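
Several commits in this list store values in 24 bits instead of 32. A minimal sketch of such a type follows; the `Int24` class here is written for illustration only and its layout is an assumption, not a copy of the project's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 24-bit unsigned integer kept in three bytes. It has only
// uint8_t members, so its alignment is 1 and it introduces no padding
// when placed inside a tightly packed struct.
class Int24
{
public:
    Int24() : m_val{ 0, 0, 0 } {}
    explicit Int24( uint32_t v ) { Set( v ); }

    void Set( uint32_t v )
    {
        assert( v < ( 1u << 24 ) );   // the value must fit in 24 bits
        m_val[0] = uint8_t( v );
        m_val[1] = uint8_t( v >> 8 );
        m_val[2] = uint8_t( v >> 16 );
    }

    uint32_t Val() const
    {
        return uint32_t( m_val[0] )
             | ( uint32_t( m_val[1] ) << 8 )
             | ( uint32_t( m_val[2] ) << 16 );
    }

private:
    uint8_t m_val[3];
};

static_assert( sizeof( Int24 ) == 3, "Int24 should occupy exactly three bytes" );
```

The trade-off is a hard limit of 16,777,215 distinct values and a few extra shift and mask instructions per access, in exchange for one byte saved per stored index.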
Bartosz Taudul
f0b957ec56 Store callstacks on 24 bits.
ZoneEvent is now 27 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9224 -> 9011  (97%)
chicken         2044 -> 2027  (99%)
drl-l-b         1443 -> 1383  (95%)
long            5327 -> 5253  (98%)
q3bsp-mt        5400 -> 5304  (98%)
selfprofile     1403 -> 1382  (98%)
2019-10-01 22:38:17 +02:00
Bartosz Taudul
717a212563 Save another 2 bytes per ZoneEvent.
ZoneEvent is now 28 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9527 -> 9224  (96%)
chicken         2107 -> 2044  (97%)
drl-l-b         1479 -> 1443  (97%)
long            5412 -> 5327  (98%)
q3bsp-mt        5592 -> 5400  (96%)
selfprofile     1443 -> 1403  (97%)
2019-10-01 01:05:37 +02:00
Bartosz Taudul
2470936050 Don't perform background tasks during trace upgrade. 2019-09-29 20:52:25 +02:00
Bartosz Taudul
d228bcb622 Pack StringIdx in 24 bits.
This reduces ZoneEvent size from 32 to 30 bytes.

Memory usage reduction on selected traces (sizes in MB):

big             9902 -> 9527  (96%)
chicken         2172 -> 2107  (97%)
ctx-big          311 ->  309  (99%)
drl-l-b         1570 -> 1479  (94%)
long            5496 -> 5412  (98%)
mem             6468 -> 6468  (100%)
q3bsp-mt        5784 -> 5592  (96%)
selfprofile     1486 -> 1443  (97%)
2019-09-29 20:32:42 +02:00
Aleksei Skriabin
c0c2f4536a strstr_nocase() typo fix. 2019-09-28 14:20:29 +05:00
Bartosz Taudul
a5ba74ed13 Handle multiple Vulkan threads. 2019-09-23 17:27:49 +02:00
Bartosz Taudul
82cd667b30 Allow specifying network port in server. 2019-09-21 15:43:01 +02:00
Bartosz Taudul
d8e0853cd8 Multithreaded frame image compression. 2019-09-20 23:03:12 +02:00
Bartosz Taudul
e1e5d6bd47 Add const version of PackFrameImage().
Temporary buffer needs to be handled outside of the function.
2019-09-20 22:55:55 +02:00
Bartosz Taudul
8fe9b56b6f Calculate frame statistics. 2019-09-16 22:02:47 +02:00
Bartosz Taudul
7673028dba Fix skipping memory data. 2019-09-16 15:42:25 +02:00
Bartosz Taudul
cdc4575dba Setup tid -> thread data mapping when loading trace. 2019-09-08 14:15:40 +02:00
Bartosz Taudul
ea6a0a58a7 Thread data accessor. 2019-09-08 14:07:16 +02:00
Bartosz Taudul
aac0a36a2d Don't use source location zones before they are ready. 2019-09-07 17:23:11 +02:00
Bartosz Taudul
86cb477811 Pack ZoneThreadData.
This reduces struct size from 10 to 8 bytes. Assumes 48-bit pointers
(4-level paging)!

Memory savings (MB):

android     2766    ->  2757    (99%)
big         10.29 G ->  9902    (96%)
chicken     2244    ->  2172    (96%)
ctx-android 228     ->  224     (98%)
drl-l-b     1635    ->  1570    (96%)
gn-vulkan   244     ->  240     (98%)
long        5656    ->  5496    (97%)
q3bsp-mt    6043    ->  5784    (95%)
selfprofile 1554    ->  1486    (95%)
2019-08-31 00:55:51 +02:00
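
The 48-bit pointer assumption mentioned in 86cb477811 makes it possible to fit a pointer and a 16-bit value into one 64-bit word. The sketch below shows the general technique; `PackedPtr16` is a hypothetical name and the actual field layout of ZoneThreadData is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed pair: a pointer in the low 48 bits and a 16-bit tag
// in the high 16 bits of a single 64-bit word.
template<typename T>
class PackedPtr16
{
public:
    PackedPtr16( T* ptr, uint16_t tag )
    {
        const uint64_t p = uint64_t( reinterpret_cast<uintptr_t>( ptr ) );
        // Assumption: the address uses at most 48 bits with the upper
        // bits zero, as user-space addresses do under 4-level paging.
        assert( ( p >> 48 ) == 0 );
        m_data = p | ( uint64_t( tag ) << 48 );
    }

    T* Ptr() const
    {
        return reinterpret_cast<T*>( uintptr_t( m_data & 0xFFFFFFFFFFFFull ) );
    }

    uint16_t Tag() const { return uint16_t( m_data >> 48 ); }

private:
    uint64_t m_data;
};

static_assert( sizeof( PackedPtr16<int> ) == 8, "pointer and tag share eight bytes" );
```

Stored separately, an 8-byte pointer and a 2-byte value need 10 bytes in a packed struct, which matches the 10 to 8 byte reduction quoted in the commit message. The approach breaks on systems that hand out addresses wider than 48 bits, hence the warning in the message.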
Bartosz Taudul
3ec534cdf3 Prevent "ntdll.dll" from appearing as a thread name. 2019-08-30 23:09:07 +02:00
Bartosz Taudul
1c0c6311ec Fix skipping data when loading traces. 2019-08-30 01:16:42 +02:00
Bartosz Taudul
a2f968d843 Compress thread id in MessageData. 2019-08-28 21:03:01 +02:00
Bartosz Taudul
fd5014be6f GetThreadString() is no longer used. 2019-08-28 20:08:16 +02:00
Bartosz Taudul
3c092b4bec Add thread name getter combining local and external thread names. 2019-08-27 23:00:13 +02:00
Bartosz Taudul
f76f38777e Signed minus unsigned is unsigned... 2019-08-26 19:09:12 +02:00
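
The pitfall named in f76f38777e follows from the usual arithmetic conversions in C++: when a signed and an unsigned operand of the same width are mixed, the signed one is converted to unsigned. The helper functions below are hypothetical and only demonstrate the effect; they do not show the code that the commit changed.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The result type of a mixed 64-bit subtraction is unsigned.
static_assert(
    std::is_same<decltype( int64_t( 0 ) - uint64_t( 0 ) ), uint64_t>::value,
    "int64_t minus uint64_t yields uint64_t" );

// Mixed subtraction: a "negative" difference wraps around modulo 2^64.
inline uint64_t MixedDiff( int64_t a, uint64_t b )
{
    return a - b;
}

// Converting the unsigned operand first keeps the arithmetic signed.
// This assumes b is small enough to be representable as int64_t.
inline int64_t SignedDiff( int64_t a, uint64_t b )
{
    return a - int64_t( b );
}
```

With `a = 5` and `b = 10`, the mixed version returns 2^64 - 5 rather than -5, so any later `< 0` check on that result can never be true.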
Bartosz Taudul
1712431dfd Compress external threads. Saves 4 bytes per ctx switch.
Dropped support for loading context switch data in previous versions of
traces.
2019-08-19 23:09:58 +02:00
Bartosz Taudul
21e7a4bb16 Extract thread compression into a separate class. 2019-08-19 22:56:58 +02:00
Bartosz Taudul
94382f54ca Move FileVersion() to TracyFileHeader.hpp. 2019-08-19 22:56:58 +02:00
Bartosz Taudul
19857473e3 Also collect information on local threads. 2019-08-18 14:56:17 +02:00
Bartosz Taudul
3b8518f7b6 Save/load CPU thread data. 2019-08-18 01:53:38 +02:00
Bartosz Taudul
62dbe522c5 Add accessors. 2019-08-18 01:51:02 +02:00
Bartosz Taudul
103645c2fa Calculate cpu thread data statistics. 2019-08-18 01:50:49 +02:00
Bartosz Taudul
1498417a8d Save/load tid to pid mapping. 2019-08-17 22:36:21 +02:00
Bartosz Taudul
20e8a5ecc8 Create tid to pid mapping. 2019-08-17 22:32:41 +02:00
Bartosz Taudul
678e942e9f Transfer PID of profiled program. 2019-08-17 22:19:04 +02:00
Bartosz Taudul
414f903cc5 Collect thread wakeup data. 2019-08-17 17:05:29 +02:00
Bartosz Taudul
26be78530f Use signed number to calculate frame offset. 2019-08-17 15:22:54 +02:00
Bartosz Taudul
6c53cac15e Fix uninitialized variable. 2019-08-16 21:20:04 +02:00
Bartosz Taudul
e975c4d7bf Also retrieve external thread names. 2019-08-16 19:49:16 +02:00
Bartosz Taudul
ccaf92afc4 Save/load external process names. 2019-08-16 19:24:38 +02:00