Bartosz Taudul
b5419944aa
Only write to memory if value has changed.
2019-10-25 21:28:55 +02:00
Bartosz Taudul
779063a18b
Cache last shrunk source location.
2019-10-25 21:07:28 +02:00
Bartosz Taudul
294793367f
Cache last CheckSourceLocation query.
...
Just knowing that the query was performed is enough here -- this
function adds a new source location entry if there isn't one already.
2019-10-25 21:01:33 +02:00
Bartosz Taudul
0f2503d334
Send time deltas in GPU time events.
2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176
Send delta times for context switches.
2019-10-25 19:13:11 +02:00
Bartosz Taudul
29c42cc8d7
Fix assert.
2019-10-25 01:00:32 +02:00
Bartosz Taudul
17a51c898e
No need to check if vector is empty.
2019-10-25 00:54:46 +02:00
Bartosz Taudul
b5e759bc5a
Don't calculate child index twice.
2019-10-25 00:54:46 +02:00
Bartosz Taudul
70f1074490
Don't iterate over children to calculate zone self time.
2019-10-25 00:33:44 +02:00
Bartosz Taudul
1fe76be955
Don't reconstruct lock event time on insert.
2019-10-24 23:25:04 +02:00
Bartosz Taudul
b83d0f46d9
Improve updating last time.
...
Avoid load-hit-store (LHS); don't write unless needed.
2019-10-24 23:23:52 +02:00
Bartosz Taudul
721f3c8925
Callstack is already zero-initialized.
2019-10-24 23:05:39 +02:00
Bartosz Taudul
c9da5f1474
Use cached thread retriever.
2019-10-24 22:34:18 +02:00
Bartosz Taudul
5873561b54
Add cached thread retriever.
2019-10-24 22:33:48 +02:00
Bartosz Taudul
06bc802107
Avoid load-hit-store.
2019-10-24 22:24:00 +02:00
Bartosz Taudul
1cfb5adc44
Count transferred data size.
2019-10-24 00:47:16 +02:00
Bartosz Taudul
ba61a9ed84
Transfer time deltas, not absolute times.
...
This change significantly reduces network bandwidth requirements.
Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
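A minimal sketch of the delta idea described in this entry, not the project's actual wire format. The helper names `EncodeDeltas` and `DecodeDeltas` are hypothetical, and a single shared reference time is an assumption made for brevity. Each timestamp is sent as the difference from the previous one, and the receiver accumulates the differences to recover absolute times.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: replace each absolute timestamp with its
// difference from the previously sent timestamp (the first one is
// relative to 0).
std::vector<int64_t> EncodeDeltas( const std::vector<int64_t>& times )
{
    std::vector<int64_t> out;
    out.reserve( times.size() );
    int64_t ref = 0;
    for( int64_t t : times )
    {
        out.push_back( t - ref );
        ref = t;
    }
    return out;
}

// Hypothetical helper: accumulate deltas to rebuild absolute timestamps.
std::vector<int64_t> DecodeDeltas( const std::vector<int64_t>& deltas )
{
    std::vector<int64_t> out;
    out.reserve( deltas.size() );
    int64_t ref = 0;
    for( int64_t d : deltas )
    {
        ref += d;
        out.push_back( ref );
    }
    return out;
}

// Round-trip check on a small example.
inline void DeltaRoundTripExample()
{
    const std::vector<int64_t> times = { 1000, 1010, 1025 };
    assert( DecodeDeltas( EncodeDeltas( times ) ) == times );
}
```

Deltas between nearby events are small numbers with mostly zero high bytes, which a general-purpose compressor can typically shrink much better than full 64-bit timestamps. Whether that is the exact source of the saving here is an assumption.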
Bartosz Taudul
5c92eae3ed
Add early exit for invalid times.
2019-10-20 18:47:50 +02:00
Bartosz Taudul
4d761def61
Microoptimize comparison.
2019-10-16 20:26:39 +02:00
Bartosz Taudul
3ae5c125f6
Implement counting CPU usage (ctx switch) at a given time.
2019-10-15 16:54:43 +02:00
Bartosz Taudul
3ce6b1205f
Don't iterate over 256 CPUs.
2019-10-15 16:13:53 +02:00
Bartosz Taudul
eccb0b1e4a
Track max CPU present in context switch data.
2019-10-15 16:13:53 +02:00
Bartosz Taudul
bdb8516d04
Make sure context switch end time wasn't set already.
2019-10-15 14:54:28 +02:00
Bartosz Taudul
215dc8a804
More compact GpuEvent struct (save 4 bytes).
...
Memory usage reduction of various traces:
big 9011 -> 9007
frameimages 561 -> 552
fi-big 4144 -> 4139
long 5253 -> 5125
2019-10-13 14:42:52 +02:00
Bartosz Taudul
5e1894dd79
Count GPU zones.
2019-10-13 14:13:04 +02:00
Bartosz Taudul
65ea33a60f
Store memory callstack data as 24-bit ints.
...
This reduces MemEvent size from 40 to 38 bytes.
Memory usage reduction:
chicken 2027 -> 2019
mem 6468 -> 6308
q3bsp-mt 5304 -> 5283
2019-10-01 22:38:17 +02:00
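The 24-bit storage mentioned in this entry can be illustrated with a three-byte integer wrapper. This is a sketch only: the class name `Int24` and its interface are assumptions made for illustration, not the type used in the code base.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 3-byte unsigned integer. Because it is built from single
// bytes it has alignment 1, so it adds exactly 3 bytes to a packed struct.
class Int24
{
public:
    Int24() : m_val{ 0, 0, 0 } {}
    explicit Int24( uint32_t v ) { Set( v ); }

    void Set( uint32_t v )
    {
        assert( v < ( 1u << 24 ) );   // value must fit in 24 bits
        m_val[0] = uint8_t( v );
        m_val[1] = uint8_t( v >> 8 );
        m_val[2] = uint8_t( v >> 16 );
    }

    uint32_t Get() const
    {
        return uint32_t( m_val[0] )
             | ( uint32_t( m_val[1] ) << 8 )
             | ( uint32_t( m_val[2] ) << 16 );
    }

private:
    uint8_t m_val[3];
};

static_assert( sizeof( Int24 ) == 3, "Int24 must occupy exactly three bytes" );
```

The size reduction only materializes when the enclosing struct has no stricter alignment requirement that would reintroduce padding, which is why such fields are usually combined with byte-packed layouts.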
Bartosz Taudul
f0b957ec56
Store callstacks on 24 bits.
...
ZoneEvent is now 27 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9224 -> 9011 (97%)
chicken 2044 -> 2027 (99%)
drl-l-b 1443 -> 1383 (95%)
long 5327 -> 5253 (98%)
q3bsp-mt 5400 -> 5304 (98%)
selfprofile 1403 -> 1382 (98%)
2019-10-01 22:38:17 +02:00
Bartosz Taudul
717a212563
Save another 2 bytes per ZoneEvent.
...
ZoneEvent is now 28 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9527 -> 9224 (96%)
chicken 2107 -> 2044 (97%)
drl-l-b 1479 -> 1443 (97%)
long 5412 -> 5327 (98%)
q3bsp-mt 5592 -> 5400 (96%)
selfprofile 1443 -> 1403 (97%)
2019-10-01 01:05:37 +02:00
Bartosz Taudul
2470936050
Don't perform background tasks during trace upgrade.
2019-09-29 20:52:25 +02:00
Bartosz Taudul
d228bcb622
Pack StringIdx in 24 bits.
...
This reduces ZoneEvent size from 32 to 30 bytes.
Memory usage reduction on selected traces (sizes in MB):
big 9902 -> 9527 (96%)
chicken 2172 -> 2107 (97%)
ctx-big 311 -> 309 (99%)
drl-l-b 1570 -> 1479 (94%)
long 5496 -> 5412 (98%)
mem 6468 -> 6468 (100%)
q3bsp-mt 5784 -> 5592 (96%)
selfprofile 1486 -> 1443 (97%)
2019-09-29 20:32:42 +02:00
Aleksei Skriabin
c0c2f4536a
strstr_nocase() typo fix.
2019-09-28 14:20:29 +05:00
Bartosz Taudul
a5ba74ed13
Handle multiple Vulkan threads.
2019-09-23 17:27:49 +02:00
Bartosz Taudul
82cd667b30
Allow specifying network port in server.
2019-09-21 15:43:01 +02:00
Bartosz Taudul
d8e0853cd8
Multithreaded frame image compression.
2019-09-20 23:03:12 +02:00
Bartosz Taudul
e1e5d6bd47
Add const version of PackFrameImage().
...
Temporary buffer needs to be handled outside of the function.
2019-09-20 22:55:55 +02:00
Bartosz Taudul
8fe9b56b6f
Calculate frame statistics.
2019-09-16 22:02:47 +02:00
Bartosz Taudul
7673028dba
Fix skipping memory data.
2019-09-16 15:42:25 +02:00
Bartosz Taudul
cdc4575dba
Setup tid -> thread data mapping when loading trace.
2019-09-08 14:15:40 +02:00
Bartosz Taudul
ea6a0a58a7
Thread data accessor.
2019-09-08 14:07:16 +02:00
Bartosz Taudul
aac0a36a2d
Don't use source location zones before they are ready.
2019-09-07 17:23:11 +02:00
Bartosz Taudul
86cb477811
Pack ZoneThreadData.
...
This reduces struct size from 10 to 8 bytes. Assumes 48-bit pointers
(4-level paging)!
Memory savings (MB):
android 2766 -> 2757 (99%)
big 10.29 GB -> 9902 MB (96%)
chicken 2244 -> 2172 (96%)
ctx-android 228 -> 224 (98%)
drl-l-b 1635 -> 1570 (96%)
gn-vulkan 244 -> 240 (98%)
long 5656 -> 5496 (97%)
q3bsp-mt 6043 -> 5784 (95%)
selfprofile 1554 -> 1486 (95%)
2019-08-31 00:55:51 +02:00
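A sketch of how a pointer and a 16-bit value can share 8 bytes under the 48-bit pointer assumption stated in this entry. The names `Zone` and `PackedZoneThread` are hypothetical, and the bit layout (pointer in the low 48 bits, thread in the high 16) is chosen for illustration rather than taken from the source.

```cpp
#include <cassert>
#include <cstdint>

struct Zone { int dummy; };   // stand-in for the pointed-to data

// Hypothetical packed pair: 48-bit pointer plus 16-bit thread index.
class PackedZoneThread
{
public:
    void Set( const Zone* zone, uint16_t thread )
    {
        const uint64_t p = uint64_t( reinterpret_cast<uintptr_t>( zone ) );
        // Only valid if the upper 16 bits of the address are zero, which
        // holds for user-space addresses with 4-level paging on x86-64.
        assert( ( p >> 48 ) == 0 );
        m_data = p | ( uint64_t( thread ) << 48 );
    }

    Zone* GetZone() const
    {
        return reinterpret_cast<Zone*>( uintptr_t( m_data & 0xFFFFFFFFFFFFull ) );
    }

    uint16_t GetThread() const { return uint16_t( m_data >> 48 ); }

private:
    uint64_t m_data = 0;
};

static_assert( sizeof( PackedZoneThread ) == 8, "pointer and thread share 8 bytes" );
```

On systems where user-space addresses can exceed 48 bits (for example with 5-level paging), the assertion would fire, which is the limitation the entry warns about.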
Bartosz Taudul
3ec534cdf3
Prevent "ntdll.dll" from appearing as a thread name.
2019-08-30 23:09:07 +02:00
Bartosz Taudul
1c0c6311ec
Fix skipping data when loading traces.
2019-08-30 01:16:42 +02:00
Bartosz Taudul
a2f968d843
Compress thread id in MessageData.
2019-08-28 21:03:01 +02:00
Bartosz Taudul
fd5014be6f
GetThreadString() is no longer used.
2019-08-28 20:08:16 +02:00
Bartosz Taudul
3c092b4bec
Add thread name getter combining local and external thread names.
2019-08-27 23:00:13 +02:00
Bartosz Taudul
f76f38777e
Signed minus unsigned is unsigned...
2019-08-26 19:09:12 +02:00
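The pitfall named in this entry follows from the usual arithmetic conversions in C++: when a signed and an unsigned operand of the same width meet, the signed one is converted to unsigned. The function names below are hypothetical and only illustrate the effect.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The result type of int64_t - uint64_t is uint64_t.
static_assert( std::is_same<decltype( int64_t( 0 ) - uint64_t( 0 ) ), uint64_t>::value,
               "signed minus unsigned is unsigned" );

// Naive version: the subtraction wraps around and the division is unsigned.
inline uint64_t HalfDiffNaive( int64_t a, uint64_t b )
{
    return ( a - b ) / 2;
}

// Fixed version: convert the unsigned operand first, so the arithmetic is
// signed (assumes b fits in int64_t).
inline int64_t HalfDiffFixed( int64_t a, uint64_t b )
{
    return ( a - int64_t( b ) ) / 2;
}

inline void SignednessExample()
{
    assert( HalfDiffNaive( 2, 6 ) == 9223372036854775806ull );   // not -2
    assert( HalfDiffFixed( 2, 6 ) == -2 );
}
```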
Bartosz Taudul
1712431dfd
Compress external threads. Saves 4 bytes per ctx switch.
...
Dropped support for loading context switch data in previous versions of
traces.
2019-08-19 23:09:58 +02:00
Bartosz Taudul
21e7a4bb16
Extract thread compression into a separate class.
2019-08-19 22:56:58 +02:00
Bartosz Taudul
94382f54ca
Move FileVersion() to TracyFileHeader.hpp.
2019-08-19 22:56:58 +02:00
Bartosz Taudul
19857473e3
Also collect information on local threads.
2019-08-18 14:56:17 +02:00
Bartosz Taudul
3b8518f7b6
Save/load CPU thread data.
2019-08-18 01:53:38 +02:00
Bartosz Taudul
62dbe522c5
Add accessors.
2019-08-18 01:51:02 +02:00
Bartosz Taudul
103645c2fa
Calculate cpu thread data statistics.
2019-08-18 01:50:49 +02:00
Bartosz Taudul
1498417a8d
Save/load tid to pid mapping.
2019-08-17 22:36:21 +02:00
Bartosz Taudul
20e8a5ecc8
Create tid to pid mapping.
2019-08-17 22:32:41 +02:00
Bartosz Taudul
678e942e9f
Transfer PID of profiled program.
2019-08-17 22:19:04 +02:00
Bartosz Taudul
414f903cc5
Collect thread wakeup data.
2019-08-17 17:05:29 +02:00
Bartosz Taudul
26be78530f
Use signed number to calculate frame offset.
2019-08-17 15:22:54 +02:00
Bartosz Taudul
6c53cac15e
Fix uninitialized variable.
2019-08-16 21:20:04 +02:00
Bartosz Taudul
e975c4d7bf
Also retrieve external thread names.
2019-08-16 19:49:16 +02:00
Bartosz Taudul
ccaf92afc4
Save/load external process names.
2019-08-16 19:24:38 +02:00
Bartosz Taudul
fe7f56b022
Implement retrieval of external process names.
2019-08-16 19:22:23 +02:00
Bartosz Taudul
c212661714
Allow determining whether thread is local to profiled program.
2019-08-16 17:59:25 +02:00
Bartosz Taudul
cef7e4b8d0
Save/load per-cpu context switches.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
8bc4258e29
Display count of per-cpu context switch data.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
69527d2f71
Collect per-cpu context switch data.
2019-08-16 16:51:18 +02:00
Bartosz Taudul
42c71d7e46
Fix loading old traces.
2019-08-16 00:24:29 +02:00
Bartosz Taudul
889eddd646
Pack ContextSwitchData. Saves 3 bytes per context switch region.
2019-08-15 23:53:47 +02:00
Bartosz Taudul
c22c259a13
Pack time and thread in MemEvent.
...
This saves 4 bytes per logged memory allocation. Memory savings for
selected traces:
android 2945 MB -> 2766 MB
chicken 2261 MB -> 2245 MB
q3bsp-mt 6085 MB -> 6043 MB
mem 6788 MB -> 6468 MB
2019-08-15 23:02:43 +02:00
Bartosz Taudul
9618ee3581
Fix skipping locks.
2019-08-15 22:24:27 +02:00
Bartosz Taudul
8b73dece98
Preserve magic time values when loading old traces.
2019-08-15 21:30:37 +02:00
Bartosz Taudul
3db3952135
Hackfix for broken lock terminate times.
2019-08-15 20:45:00 +02:00
Bartosz Taudul
5e20b3f28a
Pack time and source location in LockEvent.
2019-08-15 20:39:16 +02:00
Bartosz Taudul
bf3ad57456
Pack start time and srcloc together in ZoneEvent.
...
This reduces ZoneEvent struct size by 2 bytes. Memory savings on various
captures:
10.62 GB -> 10.29 GB
2342 MB -> 2276 MB
1706 MB -> 1635 MB
6277 MB -> 6085 MB
2019-08-15 20:17:36 +02:00
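One way to store a start time and a 16-bit source location identifier in a single 64-bit word is sketched below. The class name, the field order (time in the upper 48 bits) and the signed 48-bit time range are assumptions for illustration. The code also relies on arithmetic right shift and modular signed conversion, which C++20 guarantees and mainstream compilers provided earlier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed field: signed 48-bit time plus 16-bit srcloc.
class PackedStartSrcloc
{
public:
    void SetStart( int64_t t )
    {
        assert( t >= -( 1ll << 47 ) && t < ( 1ll << 47 ) );   // must fit in 48 bits
        m_data = ( m_data & 0xFFFF ) | ( uint64_t( t ) << 16 );
    }

    void SetSrcLoc( int16_t s )
    {
        m_data = ( m_data & ~uint64_t( 0xFFFF ) ) | uint16_t( s );
    }

    // Arithmetic shift restores the sign of the 48-bit time.
    int64_t Start() const { return int64_t( m_data ) >> 16; }

    int16_t SrcLoc() const { return int16_t( m_data & 0xFFFF ); }

private:
    uint64_t m_data = 0;
};

static_assert( sizeof( PackedStartSrcloc ) == 8, "time and srcloc share 8 bytes" );
```

Compared with a separate 64-bit time and 16-bit identifier, this layout removes the 2 bytes the identifier would otherwise occupy in a byte-packed struct, matching the saving quoted in the entry.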
Bartosz Taudul
042e6c9e11
Set initial time of old traces to 0.
2019-08-15 20:17:36 +02:00
Bartosz Taudul
b322d20c19
Store received timestamps offset to 0.
2019-08-15 20:17:36 +02:00
Bartosz Taudul
659907c972
Store srcloc identifiers using 16 bits.
...
This reduces various structure sizes by 2 bytes. Memory usage reduction
on various traces:
big 11 GB -> 10.62 GB
chicken 2436 MB -> 2342 MB
drl-light-big 1761 MB -> 1706 MB
q3bsp-mt 6469 MB -> 6277 MB
2019-08-15 20:15:48 +02:00
Bartosz Taudul
416113fdcb
Drop support for ETC1 frame images.
2019-08-15 16:29:50 +02:00
Bartosz Taudul
9a364fe5fe
Cache context switch data queries.
2019-08-14 20:16:11 +02:00
Bartosz Taudul
3e01ca3269
Calculate how long a thread was in the running state.
2019-08-14 17:12:48 +02:00
Bartosz Taudul
0bb0c10e3c
Revert "Save one byte on ContextSwitchData."
...
Counting bits is hard, let's go shopping.
2019-08-14 13:55:05 +02:00
Bartosz Taudul
3996516fce
One more SetThreadName() to change.
2019-08-14 02:27:01 +02:00
Bartosz Taudul
f285e0f5cc
Save one byte on ContextSwitchData.
2019-08-13 15:16:46 +02:00
Bartosz Taudul
9417ad994d
Save/load context switch data.
2019-08-13 13:10:37 +02:00
Bartosz Taudul
1c937ad9bb
Implement skipping frame image data.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
8c494eabbf
Display number of context switch regions.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
0b03fed61c
Add context switch accessor.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
419f74280d
Store context switches.
2019-08-13 02:35:32 +02:00
Bartosz Taudul
8aa0be39d5
Drop support for CPU id queries.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
d6f32a0839
Serialize lock processing.
...
This makes it much easier to process on the server and opens new
optimization possibilities. It also fixes theoretical problems, which
may be caused by invalid ordering of events with the same timestamp.
2019-08-12 13:51:01 +02:00
Bartosz Taudul
6398ecb344
Drop support for pre-0.4 traces.
2019-08-12 12:36:37 +02:00
Bartosz Taudul
a76622d17a
Cache last searched ThreadData.
2019-08-03 14:35:01 +02:00
Bartosz Taudul
12969ee497
Track thread context.
...
This change exploits the fact that events are processed in batches
originating from a single thread. A single message changing thread
context is enough to handle multiple messages, as opposed to inclusion
of thread identifier in each message.
2019-08-02 20:18:08 +02:00
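The thread context idea in this entry can be sketched as a receiver that remembers the most recent thread identifier and attributes all following events to it. The message types, the `Msg` layout and the function name are hypothetical, not the actual protocol.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical message kinds: one switches the thread context, the other
// is an ordinary event that carries no thread identifier of its own.
enum class MsgType : uint8_t { ThreadContext, ZoneBegin };

struct Msg
{
    MsgType type;
    uint64_t payload;   // thread id for ThreadContext, timestamp otherwise
};

// Counts zone events per thread, using only the context messages to
// decide which thread an event belongs to.
std::unordered_map<uint64_t, int> CountZonesPerThread( const std::vector<Msg>& stream )
{
    std::unordered_map<uint64_t, int> counts;
    uint64_t currentThread = 0;
    for( const auto& m : stream )
    {
        switch( m.type )
        {
        case MsgType::ThreadContext:
            currentThread = m.payload;
            break;
        case MsgType::ZoneBegin:
            counts[currentThread]++;
            break;
        }
    }
    return counts;
}

inline void ThreadContextExample()
{
    const std::vector<Msg> stream = {
        { MsgType::ThreadContext, 11 }, { MsgType::ZoneBegin, 100 }, { MsgType::ZoneBegin, 110 },
        { MsgType::ThreadContext, 22 }, { MsgType::ZoneBegin, 120 },
    };
    auto counts = CountZonesPerThread( stream );
    assert( counts[11] == 2 && counts[22] == 1 );
}
```

With batches that originate from a single thread, one context message replaces a thread identifier in every event of the batch.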
Bartosz Taudul
a4e7a341c0
Proper handling of disconnect request.
2019-08-01 23:14:09 +02:00
Bartosz Taudul
dc49f2f76a
Move DXT1 index conversion to server.
2019-07-19 21:46:58 +02:00
Bartosz Taudul
2e774f4626
Save/load application info.
2019-07-12 18:45:35 +02:00
Bartosz Taudul
d64ab7db5a
Store app info messages.
2019-07-12 18:34:46 +02:00
Bartosz Taudul
10bcc8c770
Switch to DXT1 textures in profiler utility.
2019-06-27 19:14:51 +02:00
Bartosz Taudul
7dc7ece2bd
Add staging area for frame images.
...
Compressing frame images on a separate thread may cause frame image
arrival before frames are sent. Fix this issue by creating a staging
area in which frame images will wait for frames to arrive.
This probably breaks playback functionality, as non-existent frames may
be queried, but this problem seems to be very hard to find, so let's
ignore it for now.
2019-06-27 13:24:35 +02:00
Bartosz Taudul
bb35f9a897
Compress frame images in a separate thread.
2019-06-27 13:24:35 +02:00
Bartosz Taudul
1c41229766
Use proper type for buffer size comparison.
2019-06-22 14:07:53 +02:00
Bartosz Taudul
6a82f666a7
Cosmetics.
2019-06-22 14:05:18 +02:00
Bartosz Taudul
eb4c7ca9ea
Ignore useless warnings.
2019-06-22 13:40:00 +02:00
Bartosz Taudul
850815534e
Insert frame mark at beginning of on-demand connection.
2019-06-21 19:39:41 +02:00
Bartosz Taudul
31a4a45b14
Ignore memory free faults if running on Apple platforms.
...
There's a case in MoltenVK initialization where overloading operator new
and operator delete works for std::string destruction, but not
construction.
2019-06-13 14:15:17 +02:00
Bartosz Taudul
37d1457b44
Frame image may need flipping.
2019-06-12 15:28:32 +02:00
Bartosz Taudul
eb6ac5e6e1
Store frame reference in frame images.
2019-06-12 00:55:02 +02:00
Bartosz Taudul
5f8eadfb16
Release zone id stack.
2019-06-09 17:56:41 +02:00
Bartosz Taudul
b1f8d9fba1
Send server termination query on server disconnect.
2019-06-09 16:10:49 +02:00
Bartosz Taudul
2c780f1af4
Allow sending immediate termination query from server.
2019-06-09 16:10:49 +02:00
Bartosz Taudul
d6d7b82529
Ignore invalid frame images in on-demand mode.
2019-06-09 15:37:49 +02:00
Bartosz Taudul
50cda7720f
Handle frame image instrumentation failures.
2019-06-09 13:44:53 +02:00
Bartosz Taudul
bef1988800
Compress frame images using LZ4.
2019-06-08 12:17:18 +02:00
Bartosz Taudul
fc5a8f7e3a
Assign frame image to the correct frame (including offset).
2019-06-07 20:13:08 +02:00
Bartosz Taudul
42a30bffe1
Frame images are now ETC1 compressed.
2019-06-07 00:31:51 +02:00
Bartosz Taudul
646e7327b8
Show loading progress of frame images.
2019-06-06 23:40:37 +02:00
Bartosz Taudul
6b2741ccdb
Save/load frame images.
2019-06-06 23:08:19 +02:00
Bartosz Taudul
cd2f572a2f
Use proper index.
2019-06-06 22:22:57 +02:00
Bartosz Taudul
af56f41e32
Add frame image accessor.
2019-06-06 22:14:51 +02:00
Bartosz Taudul
34b84bb284
Add frame image index to frame data.
2019-06-06 21:44:48 +02:00
Bartosz Taudul
e5bb6011c5
Frame image transfer prototype.
2019-06-06 21:39:54 +02:00
Bartosz Taudul
5681096486
Track status of worker background tasks.
2019-06-02 15:00:38 +02:00
Bartosz Taudul
845f3a2ddf
Use std::shared_mutex for locking worker access.
2019-05-28 19:21:53 +02:00
Bartosz Taudul
0da1e8551f
Track lock contention status.
2019-05-12 16:17:17 +02:00
Bartosz Taudul
a714cd4369
Typo.
2019-05-12 15:59:53 +02:00
Bartosz Taudul
74575250a5
Save message color data in trace dumps.
2019-05-10 20:32:47 +02:00
Bartosz Taudul
4850e19ebd
Store color in message data.
2019-05-10 20:26:27 +02:00
Bartosz Taudul
797ebd3caf
Cosmetics.
2019-05-10 20:20:08 +02:00
Bartosz Taudul
efc54babe3
Transfer of colored messages.
2019-05-10 20:17:44 +02:00
Bartosz Taudul
20e6813461
Store send queue size in mbps block.
2019-04-01 19:55:37 +02:00
Bartosz Taudul
9010b2c142
Put queries into queue if send buffer is full.
2019-04-01 19:47:29 +02:00
Bartosz Taudul
deeea0ee70
Track space left in send buffer.
2019-04-01 19:37:39 +02:00
Bartosz Taudul
57dff0abc9
Add server query queue.
2019-04-01 19:26:50 +02:00
Bartosz Taudul
c07c6d11b7
Define server query packet.
2019-04-01 19:21:53 +02:00
Bartosz Taudul
fef417f286
Store total number of CPU and GPU zones in trace.
2019-03-27 01:46:54 +01:00
Bartosz Taudul
2e6ac050f4
Use custom vector swap.
2019-03-26 23:02:39 +01:00
Bartosz Taudul
a632d9e2a3
Add zone vector cache.
...
Zone children will now be collected in staging vectors. When the zone
ends (and no more children can be added to it), a size-fitted vector
is allocated using slab allocation. The over-allocated vector is then
put into a cache for use in future zones.
This is only active for vectors <= 8192 elements, or 64 KB (chosen
arbitrarily), to reduce time spent on copying memory.
Overall, this change should have the following effects:
- System memory allocation pressure reduction, due to reuse of
vectors, which eliminates the need for constant growth.
- Reduction of memory usage, because children vectors are now fitted to
required size.
- Slight increase of zone processing time, due to memory copying?
2019-03-26 22:06:00 +01:00
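The staging scheme described in this entry can be sketched as follows. All names are hypothetical, `int` stands in for the child element type, a plain `new[]` stands in for the slab allocator, and the 8192-element limit is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Exact-size storage for the children of a finished zone.
struct FittedChildren
{
    std::unique_ptr<int[]> data;
    size_t size;
};

class StagingCache
{
public:
    // Hand out a previously used (already grown) vector if one is
    // available, otherwise a fresh empty one.
    std::vector<int> Acquire()
    {
        if( m_cache.empty() ) return {};
        std::vector<int> v = std::move( m_cache.back() );
        m_cache.pop_back();
        return v;
    }

    // Called when a zone ends: copy the children into exact-size storage
    // and return the over-allocated staging vector to the cache.
    FittedChildren Finish( std::vector<int>&& staging )
    {
        FittedChildren out{ std::unique_ptr<int[]>( new int[staging.size()] ), staging.size() };
        std::copy( staging.begin(), staging.end(), out.data.get() );
        staging.clear();   // drops the elements, keeps the capacity
        m_cache.push_back( std::move( staging ) );
        return out;
    }

    size_t CachedCount() const { return m_cache.size(); }

private:
    std::vector<std::vector<int>> m_cache;
};
```

A zone would call `Acquire()` when its first child arrives and `Finish()` when it ends. The copy in `Finish()` is the extra cost the entry anticipates, traded for fewer reallocations and tighter final storage.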
Bartosz Taudul
11f4dcbf1e
Consistent variable naming.
2019-03-26 21:41:44 +01:00
Bartosz Taudul
99fca9e069
Fix loading old traces when skipping locks.
2019-03-26 20:25:29 +01:00
Bartosz Taudul
e79fa04a8b
Don't fail when timer accuracy is low.
2019-03-21 21:24:07 +01:00
Bartosz Taudul
7e6a8135df
Remove double indirection in GetNextLockEvent().
2019-03-16 14:18:43 +01:00
Bartosz Taudul
67f14be6aa
Update lock ranges when loading trace.
2019-03-16 02:50:51 +01:00
Bartosz Taudul
8ced8a457c
Update thread time range on lock event insert.
2019-03-16 02:50:51 +01:00
Bartosz Taudul
dc981550a1
Load lock event time to a variable.
2019-03-16 02:50:51 +01:00
Bartosz Taudul
71e20e7e7f
Store lock map as flat_hash_map with pointer values.
2019-03-16 02:50:51 +01:00
Bartosz Taudul
5fbc14c487
Fix skipping plots in version >= 0.4.5.
2019-03-15 15:27:37 +01:00
Bartosz Taudul
a0299cc63a
Optimize calculation of standard deviation.
2019-03-14 01:23:37 +01:00
Bartosz Taudul
d64f07f853
Don't search for thread for empty timelines.
2019-03-14 01:10:57 +01:00
Bartosz Taudul
b7fe29f750
Offload timeline statistics update to a background thread.
2019-03-13 01:46:05 +01:00