Bartosz Taudul
b946c1d39e
Only enable magic fitted vectors in no-statistics builds.
...
Source location zones pointer fixup is just too slow to be feasible.
Note: no-statistics builds of the graphical profiler don't perform fixup
of view-related pointers (e.g. zone info window zone pointer). This
won't cause crashes, because the pointers are still valid, but the
displayed data will be incorrect and potentially changing in time, as
the pointer can be reused for completely other zone.
Memory usage of ToyPathTracer data, in various scenarios:
Capture + statistics: 7121 MB
Load + statistics: 6057 MB
Capture - statistics: 4876 MB
Load - statistics: 4521 MB
2019-11-11 00:20:33 +01:00
Bartosz Taudul
e1e3bbbe3e
Fixup source location zones pointers.
2019-11-11 00:20:33 +01:00
Bartosz Taudul
ae33aa4869
Fitted zone vectors are now magic vectors.
...
The pointed-to zones in the original children vector can't be freed, so
they are put into a zone pool for re-use by future zones.
2019-11-11 00:20:33 +01:00
Bartosz Taudul
4f962d2fcc
Add ZoneEvent re-use pool.
2019-11-11 00:20:33 +01:00
Bartosz Taudul
f2801491bf
Don't copy back pointer.
2019-11-10 17:48:54 +01:00
Bartosz Taudul
d4a1168491
Messages are inserted for current thread context.
2019-11-10 17:23:04 +01:00
Bartosz Taudul
003bed573c
Use ThreadData cache in zone validation.
2019-11-10 17:20:55 +01:00
Bartosz Taudul
b1c88cd1f2
Cache ThreadData pointer for current thread context.
2019-11-10 17:17:07 +01:00
Bartosz Taudul
672093cf0e
Adapt WriteTimeline() to magic vectors.
2019-11-10 16:34:38 +01:00
Bartosz Taudul
9b52152e77
Adapt GetZoneEnd() for magic vectors.
2019-11-10 01:43:28 +01:00
Bartosz Taudul
7c277234e7
Load GPU zones into magic vectors.
2019-11-10 01:36:13 +01:00
Bartosz Taudul
f8edd3a37b
Zone statistics reconstructions has to use magic vectors.
2019-11-10 00:00:40 +01:00
Bartosz Taudul
065ba4ce5a
Load zones into magic vectors.
2019-11-10 00:00:40 +01:00
Bartosz Taudul
40e9c8807d
Remove unused lambda capture.
2019-11-10 00:00:15 +01:00
Bartosz Taudul
3a317c81c6
Fix logic error.
2019-11-09 23:57:08 +01:00
Bartosz Taudul
b3698ebb0f
Merge read calls.
2019-11-09 00:48:20 +01:00
Bartosz Taudul
e80a19234e
Don't store and read compressed thread.
2019-11-09 00:23:09 +01:00
Bartosz Taudul
467d675262
Zone reads can be merged.
2019-11-09 00:08:26 +01:00
Bartosz Taudul
23c59a6fc9
Use query cache.
2019-11-08 23:59:20 +01:00
Bartosz Taudul
ec895372b7
Thread is not needed in ReadTimeline().
2019-11-08 23:56:11 +01:00
Bartosz Taudul
6ec734c264
Split ReadTimelineUpdateStatistics().
2019-11-08 23:53:43 +01:00
Bartosz Taudul
bb2d44ae08
All time deltas must be processed.
2019-11-07 16:14:23 +01:00
Bartosz Taudul
ea2c329510
Input data *must not* be changed.
...
Not even for a short moment.
2019-11-07 01:29:11 +01:00
Bartosz Taudul
4a4fe82a1b
No need to inject string terminator.
...
Comparison in m_data.stringMap already takes string size into account,
as an charutil::StringKey optimization.
2019-11-07 01:28:29 +01:00
Bartosz Taudul
dfad9695d2
Compress frame image data right as it arrives.
...
This removes the need to store temporary uncompressed image buffers,
which involves constant memory allocation and freeing. Instead, just one
permanent buffer is used, and only because the input data cannot change
during processing.
2019-11-06 23:29:59 +01:00
Bartosz Taudul
46d33f45bf
Frame image packer doesn't care about width and height.
2019-11-06 22:53:01 +01:00
Bartosz Taudul
10a3516099
Delete uncompressed frame image data.
2019-11-06 22:38:19 +01:00
Bartosz Taudul
df0e28a61f
Remove more unneeded includes.
2019-11-06 01:37:58 +01:00
Bartosz Taudul
f53637891a
Remove LZ4 include from TracyWorker.hpp.
2019-11-06 01:25:38 +01:00
Bartosz Taudul
661c4a417b
Process and store plot value formatting.
2019-11-05 18:02:08 +01:00
Bartosz Taudul
a7a739eea9
Use precalculated context switch usage data.
2019-11-05 01:41:27 +01:00
Bartosz Taudul
51090e5fb9
Implement ctx switch usage reconstruction.
2019-11-05 01:28:44 +01:00
Bartosz Taudul
50b96c757e
Context switch usage reconstruction skeleton.
2019-11-05 01:28:44 +01:00
Bartosz Taudul
d9c3238462
Save 2 bytes per PlotItem.
...
Memory savings:
android 2614 MB -> 2487 MB (95%)
chicken 1932 MB -> 1852 MB (95%)
mem 6067 MB -> 5747 MB (94%)
q3bsp-mt 5059 MB -> 5017 MB (99%)
q3bsp-st 1211 MB -> 1171 MB (96%)
2019-11-03 16:29:45 +01:00
Bartosz Taudul
4bc1588a5e
Clear proper vector.
2019-11-02 16:57:18 +01:00
Bartosz Taudul
0df29d1e0b
Use short ptr for source location payload data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
72efbe28ed
Use short ptr for message data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
1e4022e05b
Use proper comparison.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
a40bbacb17
Use short ptr for CPU zone data.
2019-11-02 16:54:12 +01:00
Bartosz Taudul
cb20bf01f9
Use short ptr for GPU zone data.
2019-11-02 16:54:11 +01:00
Bartosz Taudul
c7664b0a98
Use short ptr in LockEventPtr.
2019-11-02 16:17:45 +01:00
Bartosz Taudul
45ff14d678
Fix saving source location payload data.
2019-11-02 14:28:59 +01:00
Bartosz Taudul
16bc862904
Save sizes of children vectors to prevent reallocation.
2019-11-02 12:38:32 +01:00
Bartosz Taudul
13b656fe61
Make srcloc dynamic color depend on function name.
2019-11-01 20:17:25 +01:00
Bartosz Taudul
39988ad636
Check for shutdown in background processing thread.
2019-10-31 21:41:21 +01:00
Bartosz Taudul
456deefdbc
Keep child idx on stack.
2019-10-30 23:55:21 +01:00
Bartosz Taudul
25b610a36f
Pack child into GPU start/end in GpuEvent (saves 4 bytes).
...
long 5152 MB -> 5061 MB
2019-10-30 23:50:37 +01:00
Bartosz Taudul
e8286600d1
Use -1 as invalid GPU start time.
2019-10-30 23:12:43 +01:00
Bartosz Taudul
7ce8c772ad
Disallow negative GPU times.
...
Shouldn't happen, but GPU timestamps are a shitshow, so better be safe
than sorry.
2019-10-30 22:37:07 +01:00
Bartosz Taudul
ae4794ab4c
Save 2 bytes in ContextSwitchData and ContextSwitchCpu.
2019-10-30 22:25:46 +01:00
Bartosz Taudul
99d198d0bf
Pack csAlloc in MemEvent (saves 3 bytes).
...
Memory usage change on selected traces:
android 2699 MB -> 2613 MB
chicken 2019 MB -> 2007 MB
mem 6308 MB -> 6068 MB
q3bsp-mt 5283 MB -> 5252 MB
q3bsp-st 1241 MB -> 1211 MB
2019-10-30 22:01:13 +01:00
Bartosz Taudul
6f0dc2885f
Fix connection abort.
2019-10-28 23:32:51 +01:00
Bartosz Taudul
8050622b0f
Read and decompress network data on a separate thread.
2019-10-28 23:22:50 +01:00
Bartosz Taudul
e0356ae01e
Cosmetics.
2019-10-28 22:53:06 +01:00
Bartosz Taudul
99b7e8ad92
Close socket when shutting down.
2019-10-28 22:52:52 +01:00
Bartosz Taudul
788ca2e5df
Spawn no-op network thread.
2019-10-28 22:45:10 +01:00
Bartosz Taudul
7f07f5beb4
Free child time stack.
2019-10-26 23:32:16 +02:00
Bartosz Taudul
01985f50ef
Cache source location zones counter search.
2019-10-26 16:33:40 +02:00
Bartosz Taudul
1d0084aa28
Add cache for last accessed source location zones.
2019-10-25 21:29:55 +02:00
Bartosz Taudul
b5419944aa
Only write to memory if value has changed.
2019-10-25 21:28:55 +02:00
Bartosz Taudul
779063a18b
Cache last shrinked source location.
2019-10-25 21:07:28 +02:00
Bartosz Taudul
294793367f
Cache last CheckSourceLocation query.
...
Just knowing that the query was performed is enough here -- this
function adds a new source location entry, if there already isn't one.
2019-10-25 21:01:33 +02:00
Bartosz Taudul
0f2503d334
Send time deltas in GPU time events.
2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176
Send delta times for context switches.
2019-10-25 19:13:11 +02:00
Bartosz Taudul
29c42cc8d7
Fix assert.
2019-10-25 01:00:32 +02:00
Bartosz Taudul
17a51c898e
No need to check if vector is empty.
2019-10-25 00:54:46 +02:00
Bartosz Taudul
b5e759bc5a
Don't calculate child index twice.
2019-10-25 00:54:46 +02:00
Bartosz Taudul
70f1074490
Don't iterate over children to calculate zone self time.
2019-10-25 00:33:44 +02:00
Bartosz Taudul
1fe76be955
Don't reconstruct lock event time on insert.
2019-10-24 23:25:04 +02:00
Bartosz Taudul
b83d0f46d9
Improve updating last time.
...
Avoid LHS, don't write if don't need to.
2019-10-24 23:23:52 +02:00
Bartosz Taudul
721f3c8925
Callstack is already zero-initialized.
2019-10-24 23:05:39 +02:00
Bartosz Taudul
c9da5f1474
Use cached thread retriever.
2019-10-24 22:34:18 +02:00
Bartosz Taudul
5873561b54
Add cached thread retriever.
2019-10-24 22:33:48 +02:00
Bartosz Taudul
06bc802107
Avoid load-hit-store.
2019-10-24 22:24:00 +02:00
Bartosz Taudul
1cfb5adc44
Count transferred data size.
2019-10-24 00:47:16 +02:00
Bartosz Taudul
ba61a9ed84
Transfer time deltas, not absolute times.
...
This change significantly reduces network bandwidth requirements.
Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
Bartosz Taudul
5c92eae3ed
Add early exit for invalid times.
2019-10-20 18:47:50 +02:00
Bartosz Taudul
4d761def61
Microoptimize comparison.
2019-10-16 20:26:39 +02:00
Bartosz Taudul
3ae5c125f6
Implement counting CPU usage (ctx switch) at a given time.
2019-10-15 16:54:43 +02:00
Bartosz Taudul
3ce6b1205f
Don't iterate over 256 CPUs.
2019-10-15 16:13:53 +02:00
Bartosz Taudul
eccb0b1e4a
Track max CPU present in context switch data.
2019-10-15 16:13:53 +02:00
Bartosz Taudul
bdb8516d04
Make sure context switch end time wasn't set already.
2019-10-15 14:54:28 +02:00
Bartosz Taudul
215dc8a804
More compact GpuEvent struct (save 4 bytes).
...
Memory usage reduction of various traces:
big 9011 -> 9007
frameimages 561 -> 552
fi-big 4144 -> 4139
long 5253 -> 5125
2019-10-13 14:42:52 +02:00
Bartosz Taudul
5e1894dd79
Count GPU zones.
2019-10-13 14:13:04 +02:00
Bartosz Taudul
65ea33a60f
Store memory callstack data as 24-bit ints.
...
This reduces MemEvent size from 40 to 38 bytes.
Memory usage reduction:
chicken 2027 -> 2019
mem 6468 -> 6308
q3bsp-mt 5304 -> 5283
2019-10-01 22:38:17 +02:00
Bartosz Taudul
f0b957ec56
Store callstacks on 24 bits.
...
ZoneEvent is now 27 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9224 -> 9011 (97%)
chicken 2044 -> 2027 (99%)
drl-l-b 1443 -> 1383 (95%)
long 5327 -> 5253 (98%)
q3bsp-mt 5400 -> 5304 (98%)
selfprofile 1403 -> 1382 (98%)
2019-10-01 22:38:17 +02:00
Bartosz Taudul
717a212563
Save another 2 bytes per ZoneEvent.
...
ZoneEvent is not 28 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9527 -> 9224 (96%)
chicken 2107 -> 2044 (97%)
drl-l-b 1479 -> 1443 (97%)
long 5412 -> 5327 (98%)
q3bsp-mt 5592 -> 5400 (96%)
selfprofile 1443 -> 1403 (97%)
2019-10-01 01:05:37 +02:00
Bartosz Taudul
2470936050
Don't perform background tasks during trace upgrade.
2019-09-29 20:52:25 +02:00
Bartosz Taudul
d228bcb622
Pack StringIdx in 24 bits.
...
This reduces ZoneEvent size from 32 to 30 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9902 -> 9527 (96%)
chicken 2172 -> 2107 (97%)
ctx-big 311 -> 309 (99%)
drl-l-b 1570 -> 1479 (94%)
long 5496 -> 5412 (98%)
mem 6468 -> 6468 (100%)
q3bsp-mt 5784 -> 5592 (96%)
selfprofile 1486 -> 1443 (97%)
2019-09-29 20:32:42 +02:00
Aleksei Skriabin
c0c2f4536a
strstr_nocase() typo fix.
2019-09-28 14:20:29 +05:00
Bartosz Taudul
a5ba74ed13
Handle multiple Vulkan threads.
2019-09-23 17:27:49 +02:00
Bartosz Taudul
82cd667b30
Allow specifying network port in server.
2019-09-21 15:43:01 +02:00
Bartosz Taudul
d8e0853cd8
Multithreaded frame image compression.
2019-09-20 23:03:12 +02:00
Bartosz Taudul
e1e5d6bd47
Add const version of PackFrameImage().
...
Temporary buffer needs to be handled outside of the function.
2019-09-20 22:55:55 +02:00
Bartosz Taudul
8fe9b56b6f
Calculate frame statistics.
2019-09-16 22:02:47 +02:00
Bartosz Taudul
7673028dba
Fix skipping memory data.
2019-09-16 15:42:25 +02:00
Bartosz Taudul
cdc4575dba
Setup tid -> thread data mapping when loading trace.
2019-09-08 14:15:40 +02:00
Bartosz Taudul
ea6a0a58a7
Thread data accessor.
2019-09-08 14:07:16 +02:00
Bartosz Taudul
aac0a36a2d
Don't use source location zones before they are ready.
2019-09-07 17:23:11 +02:00
Bartosz Taudul
86cb477811
Pack ZoneThreadData.
...
This reduces struct size from 10 to 8 bytes. Assumes 48-bit pointers
(4-level paging)!
Memory savings (MB):
android 2766 -> 2757 (99%)
big 10.29 G -> 9902 (96%)
chicken 2244 -> 2172 (96%)
ctx-android 228 -> 224 (98%)
drl-l-b 1635 -> 1570 (96%)
gn-vulkan 244 -> 240 (98%)
long 5656 -> 5496 (97%)
q3bsp-mt 6043 -> 5784 (95%)
selfprofile 1554 -> 1486 (95%)
2019-08-31 00:55:51 +02:00