822 Commits

Author SHA1 Message Date
Bartosz Taudul
97880a89ae Clobber ecx register. 2017-10-29 16:20:07 +01:00
Bartosz Taudul
a220043114 Add no-cpu GetTime() variant.
In this version the address of the cpu output variable is constant, so
there's no stack address calculation involved.
2017-10-29 16:12:16 +01:00
Bartosz Taudul
68f5a17bca Use 32-bit registers for rdtscp output. 2017-10-29 13:15:43 +01:00
Bartosz Taudul
03289175ab Lock counter also must be initialized early. 2017-10-24 22:02:49 +02:00
Bartosz Taudul
ad338a7cfd Fix message literals. 2017-10-21 12:39:26 +02:00
Bartosz Taudul
f1da7c1c85 Force TLS block creation on cygwin before malloc. 2017-10-20 18:28:25 +02:00
Bartosz Taudul
1e645665fe Initialize rpmalloc in profiler worker thread.
Thread-local variables on gcc are apparently not initialized on thread
startup, but on first access to the thread-local variable block. Previously
this worked because s_token was accessed before any rpmalloc
allocation could be performed. Now the first rpmalloc allocation is the
Socket class, and rpmalloc is not initialized at that point, as there has
been no thread-local access yet.
2017-10-18 23:30:54 +02:00
Bartosz Taudul
9c4316879c Add TRACY_NO_EXIT macro. 2017-10-18 20:01:12 +02:00
Bartosz Taudul
51013dc0e6 Manual allocation of socket memory. 2017-10-18 19:50:28 +02:00
Bartosz Taudul
fc94378e0c Move TracyAlloc.hpp to common. Use rpmalloc only if TRACY_ENABLE. 2017-10-18 19:50:28 +02:00
Bartosz Taudul
c5ea9c744c Do not disable lz4 in debug builds. 2017-10-18 19:50:28 +02:00
Bartosz Taudul
6a2cbe2842 Rename DISABLE_LZ4 to TRACY_DISABLE_LZ4. 2017-10-18 19:50:22 +02:00
Bartosz Taudul
7c47edc64f Terminate connection handshake. 2017-10-18 18:48:51 +02:00
Bartosz Taudul
d942b7edf1 Don't exit until all data is sent. 2017-10-17 22:02:47 +02:00
Bartosz Taudul
652dccd163 Also no need to construct more than one welcome message. 2017-10-17 21:55:40 +02:00
Bartosz Taudul
5421164f33 No need to get process name more than once. 2017-10-17 21:53:09 +02:00
Bartosz Taudul
1e3476cf36 Transfer profiler initialization time. 2017-10-17 01:10:38 +02:00
Bartosz Taudul
51f5ae4796 More precise profiler init end time measurement. 2017-10-17 01:07:54 +02:00
Bartosz Taudul
0ed789825a Measure time of initialization start. 2017-10-17 01:07:34 +02:00
Bartosz Taudul
866081bf29 Initialize tracy before anything else. 2017-10-17 00:36:15 +02:00
Bartosz Taudul
9d01b508ed One more type cast. 2017-10-17 00:25:32 +02:00
Bartosz Taudul
8a6e4d2971 Change TRACY_DISABLE to TRACY_ENABLE.
By default tracy is now disabled.
2017-10-16 21:34:39 +02:00
Bartosz Taudul
518568a513 Move client/Tracy.hpp -> Tracy.hpp. 2017-10-16 21:28:38 +02:00
Bartosz Taudul
e04bd05606 Always use ShouldExit() to determine if worker should exit. 2017-10-16 21:21:42 +02:00
Bartosz Taudul
9f28205548 Use custom threading wrapper instead of std::thread.
std::thread may perform memory allocation when a thread is created (it
does so on MSVC). Tracy heap is managed by its own allocator and this
change prevents accessing the application heap.
2017-10-16 21:17:58 +02:00
Bartosz Taudul
2f8d3ff5eb Add minimal thread class implementation. 2017-10-16 21:17:58 +02:00
Bartosz Taudul
dafec48319 PAGE_SIZE is already defined in limits.h. 2017-10-16 21:17:58 +02:00
Bartosz Taudul
65c000718b Do not redefine assert macro. 2017-10-16 21:17:58 +02:00
Bartosz Taudul
31fc2335dd Silence some type mismatch warnings. 2017-10-16 21:17:58 +02:00
Bartosz Taudul
3554e4c4ac Prevent clash of likely/unlikely with possible macros. 2017-10-16 21:17:58 +02:00
Bartosz Taudul
5b9fcddfb3 String literal message transfer. 2017-10-15 13:06:49 +02:00
Bartosz Taudul
95439a726a Fix typo. 2017-10-15 13:06:20 +02:00
Bartosz Taudul
9a60c3fb6e Remove unused variable. 2017-10-14 20:03:55 +02:00
Bartosz Taudul
e496f24427 Use standard c++ features. 2017-10-14 18:48:35 +02:00
Bartosz Taudul
bded83e458 Don't include headers in a namespace. 2017-10-14 18:02:01 +02:00
Bartosz Taudul
dc25c46dee No need to init/destroy queue item memory. 2017-10-14 17:39:43 +02:00
Bartosz Taudul
472b5a521f Preallocation size is in number of elements, not bytes. 2017-10-14 17:33:05 +02:00
Bartosz Taudul
40bc4c8015 Missing include. 2017-10-14 17:21:14 +02:00
Bartosz Taudul
19011b3798 Use rpmalloc in concurrentqueue. 2017-10-14 17:19:27 +02:00
Bartosz Taudul
c497966c7f Use rpmalloc to allocate tracy client memory. 2017-10-14 17:15:18 +02:00
Bartosz Taudul
e8968efea7 Initialize rpmalloc. 2017-10-14 17:00:25 +02:00
Bartosz Taudul
b75317be7d Wrap malloc/free. 2017-10-14 16:52:05 +02:00
Bartosz Taudul
b117c56257 Wrap rpmalloc in tracy namespace. 2017-10-14 16:50:08 +02:00
Bartosz Taudul
709d86ad0c Add rpmalloc.
https://github.com/rampantpixels/rpmalloc/tree/master/rpmalloc
2592b551b26d0ac9d1c92db3c1ae6c0ce5cd447a
2017-10-14 16:43:26 +02:00
Bartosz Taudul
fa8030009f Store messages. 2017-10-14 14:28:04 +02:00
Bartosz Taudul
8c7b60fbe6 Allow sending text messages. 2017-10-14 13:23:13 +02:00
Bartosz Taudul
57afeb4588 Queue MUST allocate memory. 2017-10-13 20:33:53 +02:00
Bartosz Taudul
7f36bb6846 Mark unlikely code path.
It also changes MSVC behavior from generating two jumps to just one.
2017-10-13 20:24:11 +02:00
Bartosz Taudul
1aaab3c5e4 Use 32 bits to store lock id.
This makes queue item size 32 bytes. Queue operations can now be faster,
because multiplication by 33 is replaced by shift by 5.
2017-10-13 20:05:38 +02:00
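
The size argument in the commit above can be illustrated with a minimal sketch. The struct layout below is hypothetical (the real queue item fields are not shown in this log); only the 32-bit lock id and the 32-byte total come from the commit message. With a power-of-two element size, an element's byte offset can be computed with a shift instead of a multiplication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical item layout, assumed for illustration only.
// What matters is that the fields add up to exactly 32 bytes.
struct QueueItem {
    uint8_t  type;        // event type tag (assumed)
    uint8_t  pad[3];      // explicit padding (assumed)
    uint32_t lockId;      // lock id stored in 32 bits, as in the commit
    int64_t  time;        // timestamp (assumed)
    uint64_t payload[2];  // remaining event data (assumed)
};
static_assert(sizeof(QueueItem) == 32, "item must be 32 bytes for the shift to apply");

// Byte offset of element `index`, written as a general multiplication.
constexpr std::size_t OffsetByMul(std::size_t index) {
    return index * sizeof(QueueItem);
}

// The same offset written as a shift by 5, valid because 32 == 1 << 5.
constexpr std::size_t OffsetByShift(std::size_t index) {
    return index << 5;
}
```

Compilers usually perform this strength reduction on their own once the element size is a power of two, so the point of the sketch is the size constraint, not the hand-written shift.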
Bartosz Taudul
ec789d60e8 Store source location color in 24 bits. 2017-10-13 19:59:18 +02:00
Bartosz Taudul
fe0366c792 Receive plot data. 2017-10-13 03:36:59 +02:00
Bartosz Taudul
cb0011755d Prevent type conversions. 2017-10-13 02:21:29 +02:00
Bartosz Taudul
f0484b50ca Plot data transfer. 2017-10-13 02:07:03 +02:00
Bartosz Taudul
737671adbf Remove lock announce message.
This removes the problem with the static initialization order of mutexes
vs tracy.

Lock source location is now transferred in lock wait message.
2017-10-12 20:14:17 +02:00
Bartosz Taudul
c42106f4ff Add named version of TracyLockable. 2017-10-12 20:00:53 +02:00
Bartosz Taudul
e23da05a65 Workaround gcc stupidity. 2017-10-11 01:44:35 +02:00
Bartosz Taudul
77dfefb5d0 Remove one stack address load. 2017-10-11 01:27:22 +02:00
Bartosz Taudul
af3773dc9a Remove one level of indirection. 2017-10-11 01:04:21 +02:00
Bartosz Taudul
cc8b357f09 Avoid excessive stack operations for cpu query. 2017-10-10 23:21:30 +02:00
Bartosz Taudul
75457c1465 Remove +x flag from files. 2017-10-10 21:56:15 +02:00
Bartosz Taudul
2c252226fc Force proper initialization order on gcc. 2017-10-09 00:39:12 +02:00
Bartosz Taudul
ef525067c5 Mark tracy::Lockable<>::Mark() as const. 2017-10-06 17:14:57 +02:00
Bartosz Taudul
9736be0321 Force inline lock operations. 2017-10-06 17:05:31 +02:00
Bartosz Taudul
dcd89f894c Add lock marking. 2017-10-06 16:32:32 +02:00
Bartosz Taudul
5f9228d4e6 Fix typo. 2017-10-05 03:07:26 +02:00
Bartosz Taudul
06a08816bd Include data type in tracy::Lockable name. 2017-10-04 18:32:53 +02:00
Bartosz Taudul
8c90eab044 Let's not worry about lock memory reuse. 2017-10-04 16:51:51 +02:00
Bartosz Taudul
0011573fa9 Send lock events. 2017-10-04 16:45:46 +02:00
Bartosz Taudul
78f8425dc7 Announce lock creation. 2017-10-04 16:16:40 +02:00
Bartosz Taudul
a3ef369a56 Lockable wrapper. 2017-10-04 15:41:02 +02:00
Bartosz Taudul
f8e7f7ed83 Cygwin can't determine process name using winapi. 2017-10-04 01:22:22 +02:00
Bartosz Taudul
3f0bd793fd Send program start time, not connection time. 2017-10-04 00:34:05 +02:00
Bartosz Taudul
b2252de9c8 Send and display program execution date. 2017-10-03 23:26:41 +02:00
Bartosz Taudul
cf07383db8 Send program name in welcome message. 2017-10-03 23:17:58 +02:00
Bartosz Taudul
6485457518 Process name getter. 2017-10-03 23:17:16 +02:00
Bartosz Taudul
b1aa16763b Prevent accessing TLS data twice on gcc. 2017-10-03 16:55:04 +02:00
Bartosz Taudul
d1edd30ca6 Zone ids are unnecessary. 2017-10-03 16:41:32 +02:00
Bartosz Taudul
2fb4c47491 Remember to calibrate timer. 2017-10-03 15:35:43 +02:00
Bartosz Taudul
7b1135239c Use rdtscp when there's no intrinsic. 2017-10-03 15:28:31 +02:00
Bartosz Taudul
9cde85646a Fix typo. 2017-10-03 15:16:48 +02:00
Bartosz Taudul
e01d378f52 More force inlining. 2017-10-03 15:10:25 +02:00
Bartosz Taudul
fe41185dc0 More unique force inline macro name. 2017-10-03 14:51:58 +02:00
Bartosz Taudul
ba037e5798 Do not store tail index in memory. 2017-10-03 14:50:55 +02:00
Bartosz Taudul
dbb90e51b0 Force inlining of the hot path. 2017-10-03 14:39:02 +02:00
Bartosz Taudul
353fda95a3 Expose profiler internals to make it easier for inlining.
concurrentqueue.h doesn't bring any poisonous includes, only STL.
2017-10-03 14:22:49 +02:00
Bartosz Taudul
439a23049d Separate enqueue allocation functionality. 2017-10-03 14:13:46 +02:00
Bartosz Taudul
16a49356a0 Remove redundant variable. 2017-10-03 14:00:06 +02:00
Bartosz Taudul
7b583628ad Remove unused variables. 2017-10-03 13:58:12 +02:00
Bartosz Taudul
a1abf1f015 Record CPU id. 2017-10-01 19:17:08 +02:00
Bartosz Taudul
f46781808c Construct queue items directly in queue memory. 2017-10-01 17:49:45 +02:00
Bartosz Taudul
99b8c4c77e Prevent fake loop from optimizing out. 2017-10-01 17:42:22 +02:00
Bartosz Taudul
7b0cbef0d7 Allow manual queue item memory filling. 2017-10-01 17:14:26 +02:00
Bartosz Taudul
efda50acb1 Send timer resolution to server. 2017-09-29 18:32:07 +02:00
Bartosz Taudul
6a2cb2c14e Calculate timer resolution. 2017-09-29 18:29:39 +02:00
Bartosz Taudul
445d2831ed Explicit conversion. 2017-09-29 18:29:32 +02:00
Bartosz Taudul
b9aa10913a Rename internal enum to avoid #define conflicts. 2017-09-28 21:20:33 +02:00
Bartosz Taudul
6ae62e6e5a Missing include. 2017-09-28 21:10:02 +02:00
Bartosz Taudul
8c1c395cec Allow sending custom zone names. 2017-09-28 19:28:24 +02:00
Bartosz Taudul
a572ded1cc Add missing define in disabled section. 2017-09-28 19:20:19 +02:00
Bartosz Taudul
d1bbb731fc Zone text (custom string) transfer. 2017-09-27 02:18:17 +02:00
Bartosz Taudul
3c0ce01954 Simplify access to queue producer token.
Note that calibration loop needs separate token, as the thread_local
instance is created after the profiler (and its calibration loop).
2017-09-27 01:03:29 +02:00
Bartosz Taudul
842721a754 Make profiler instance static. 2017-09-27 01:03:01 +02:00
Bartosz Taudul
3cc7cc596e Remove GetNewId() from Profiler interface. 2017-09-27 01:02:04 +02:00
Bartosz Taudul
f584bf76e8 Profiler ID can be static (one less instruction). 2017-09-27 00:30:02 +02:00
Bartosz Taudul
e076d1d475 Send source location answer in stream, not as separate packet. 2017-09-26 19:00:25 +02:00
Bartosz Taudul
e90a86e06e Store zone color in source location struct. 2017-09-26 18:54:48 +02:00
Bartosz Taudul
7424077d70 Store source location in a single object.
Source file, function name and line number are now stored in a const
static container object. This has the following benefits:
- Slightly lighter profiling workload (3 fewer instructions).
- Profiling queue event size is significantly reduced, by 12 bytes. This
  has an effect on all queue event types.
- Source location grouping now has no cost, as it's performed at the
  compilation stage. This allows simplification of server code.
The downside is that the full source location resolution is now
performed in two steps, as the server has to query both source location
container and strings contained within. This has almost no real impact
on profiler operation.
2017-09-26 02:39:08 +02:00
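
The idea described in the commit above can be sketched roughly as follows. All names here (SourceLocation, ZoneBeginEvent, ExampleLocation, MakeZoneBegin) and the field choices are assumptions made for illustration, not the actual Tracy definitions. The sketch shows a const static object holding the function name, file and line, and an event that carries only that object's address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for everything known at compile time about a zone.
struct SourceLocation {
    const char* function;
    const char* file;
    uint32_t    line;
    uint32_t    color;  // zone color (a later commit stores it here)
};

// Hypothetical queue event: one address instead of three separate fields.
struct ZoneBeginEvent {
    int64_t  time;
    uint64_t srcloc;  // address of the static SourceLocation, sent as an id
};

// One static object per instrumented site; its address is stable for the
// lifetime of the program, so it can serve as the grouping key.
inline const SourceLocation* ExampleLocation() {
    static const SourceLocation loc{
        __func__, __FILE__, static_cast<uint32_t>(__LINE__), 0x00FF00 };
    return &loc;
}

inline ZoneBeginEvent MakeZoneBegin(int64_t time, const SourceLocation* loc) {
    return ZoneBeginEvent{
        time, static_cast<uint64_t>(reinterpret_cast<uintptr_t>(loc)) };
}
```

Because every event from the same site carries the same address, the receiving side can group events by that value and resolve the strings behind it once, which matches the two-step resolution the commit message mentions.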
Bartosz Taudul
e5ad7d9ac4 GetTime() call can be now inlined.
No dependencies on either windows.h, or static instance of Profiler.
2017-09-26 00:42:09 +02:00
Bartosz Taudul
11a790a18f Offload TSC -> time conversion to server. 2017-09-26 00:13:24 +02:00
Bartosz Taudul
519cb8dff3 Allow adding custom colors to zones. 2017-09-25 22:46:14 +02:00
Bartosz Taudul
206305fbd2 Merge TracyThread.hpp to TracySystem.cpp.
Keeping threading functions inside a source file prevents poisoning by
including windows.h.
2017-09-25 21:13:59 +02:00
Bartosz Taudul
7683da5f74 Send initial configuration as a single message. 2017-09-24 16:10:28 +02:00
Bartosz Taudul
fce04c6215 Profiling delay calibration. 2017-09-24 16:02:09 +02:00
Bartosz Taudul
bf12704b0f Increase queue preallocation size. 2017-09-24 15:59:53 +02:00
Bartosz Taudul
6a4f3842af Pre-allocate space for 64K events in queue. 2017-09-24 13:40:04 +02:00
Bartosz Taudul
7770014844 Use rdtscp to measure time on windows. 2017-09-23 21:33:05 +02:00
Bartosz Taudul
bd9ffc16b5 Hide GetTime() in Profiler. 2017-09-23 21:10:26 +02:00
Bartosz Taudul
e1a63dbb53 Drop constant merging check.
While the profiler operates sub-optimally without constant merging,
enabling it is not essential. It also causes problems on some
platforms, for example cygwin.
2017-09-23 20:16:42 +02:00
Bartosz Taudul
031818dff6 Send main thread name. 2017-09-23 01:38:26 +02:00
Bartosz Taudul
2faa1abb21 Store main thread id. 2017-09-23 01:37:07 +02:00
Bartosz Taudul
893db40bb2 Fix signed vs unsigned comparison. 2017-09-22 22:16:18 +02:00
Bartosz Taudul
340bf80435 Better thread name retrieval. 2017-09-22 02:10:36 +02:00
Bartosz Taudul
6525e1b3c1 Thread name queries. 2017-09-22 01:59:44 +02:00
Bartosz Taudul
70ad3407c0 Rework client handling of server requests. 2017-09-22 01:54:04 +02:00
Bartosz Taudul
3ba6046a53 Super bad thread name resolution. 2017-09-22 01:50:14 +02:00
Bartosz Taudul
a557a3fb30 Collect and transmit source thread information. 2017-09-22 01:11:53 +02:00
Bartosz Taudul
b0f94f6b45 Add threading helpers. 2017-09-22 01:11:14 +02:00
Bartosz Taudul
f6e8eb32ec Sort includes. 2017-09-22 00:36:36 +02:00
Bartosz Taudul
36ecf16d59 Add comments to the constant merging assert. 2017-09-19 02:19:27 +02:00
Bartosz Taudul
36fa5af728 Missing header. 2017-09-19 02:19:20 +02:00
Bartosz Taudul
0331d548d2 Automatically create profiler instance. 2017-09-18 19:08:54 +02:00
Bartosz Taudul
9d2fef2f11 Hide implementation details wrt concurrent queue. 2017-09-18 18:51:45 +02:00
Bartosz Taudul
d7914439e9 Use stream compression.
Previously each data packet was compressed independently. After this
change all new packets reference the previously sent data, which
achieves better compression.
2017-09-17 13:10:58 +02:00
Bartosz Taudul
03ece0ac48 Send frame markers. 2017-09-16 00:30:27 +02:00
Bartosz Taudul
ff07576d96 Reply to string requests. 2017-09-14 19:25:16 +02:00
Bartosz Taudul
f61f50385d Add ability to send strings over network. 2017-09-14 19:24:35 +02:00
Bartosz Taudul
f3ce055568 Mirror TracyView::ShouldExit in TracyProfiler. 2017-09-14 19:23:50 +02:00
Bartosz Taudul
2442c8fe58 Use one flag to control whether LZ4 is enabled. 2017-09-14 19:09:14 +02:00
Bartosz Taudul
76df000467 Move sending data to a separate function. 2017-09-14 19:07:56 +02:00
Bartosz Taudul
d999f35dfa Exchange time and id in queue header and data structs. 2017-09-14 01:14:40 +02:00
Bartosz Taudul
10b88754d8 Allow direct access to data size table index. 2017-09-14 01:05:08 +02:00
Bartosz Taudul
52d24d0d4c s_instance ptr may be accessed by thread. 2017-09-13 23:36:40 +02:00
Bartosz Taudul
efd66bb609 Allow changing lz4 size type. 2017-09-13 23:27:17 +02:00
Bartosz Taudul
16dd561029 Move protocol specific sizes to common header. 2017-09-13 22:56:55 +02:00
Bartosz Taudul
a31ab6a256 Move TracyQueue.hpp to common. 2017-09-13 22:56:08 +02:00
Bartosz Taudul
45646c4f45 Move TracySystem to a common directory. 2017-09-13 01:32:11 +02:00
Bartosz Taudul
997f0c64c3 Store pointers as uint64.
Pointers can't be stored as pointers, as that would cause a mismatch in
the wire protocol between 32-bit and 64-bit builds.
2017-09-13 01:24:42 +02:00
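
A minimal sketch of the approach in the commit above, using hypothetical names (WireStringRequest, PointerToWire, MakeStringRequest) that are not taken from the Tracy sources. Widening every pointer to a fixed 64-bit integer keeps the wire structure the same size regardless of whether the client was built for 32 or 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wire structure: the field is always 8 bytes wide,
// independent of sizeof(void*) on the sending side.
struct WireStringRequest {
    uint64_t ptr;
};
static_assert(sizeof(WireStringRequest) == 8, "wire layout must not depend on pointer size");

// Widen a pointer to a fixed 64-bit value. On a 32-bit build the upper
// half is simply zero. The receiver treats the value as an opaque id.
inline uint64_t PointerToWire(const void* p) {
    return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(p));
}

inline WireStringRequest MakeStringRequest(const char* str) {
    return WireStringRequest{ PointerToWire(str) };
}
```

The receiving side never dereferences such a value; it only sends it back to the client to ask for the data behind it.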
Bartosz Taudul
e8d64de5c1 Disable LZ4 in debug builds (too slow). 2017-09-12 02:20:05 +02:00
Bartosz Taudul
1ea61c2f2c Use LZ4 to compress network data.
This greatly reduces required network bandwidth, which in effect speeds
up queue processing.

Time to process a single event queue item:

      | Raw data | With LZ4 |
------+----------+----------+
Deque |  6.86 ns |   6.7 ns |
Pack  |  4.03 ns |   4.0 ns |
LZ4   |  ---     |  21.6 ns |
Send  | 214.5 ns |   5.2 ns |
------+----------+----------+
Total | 225.4 ns | 37.58 ns |
2017-09-12 02:13:22 +02:00
Bartosz Taudul
3df4cf8acd Don't send unused data. 2017-09-12 01:14:04 +02:00
Bartosz Taudul
25d7cebd8a Move common event data to separate struct. 2017-09-12 00:56:31 +02:00
Bartosz Taudul
aa10adcc9c Explicitly describe target frame size. 2017-09-12 00:49:38 +02:00
Bartosz Taudul
30ceac359d Increase block size. 2017-09-12 00:46:10 +02:00
Bartosz Taudul
6092c695bd All enqueue operations are performed with a token. 2017-09-12 00:43:25 +02:00
Bartosz Taudul
e04e1580c4 Adjust data size to fully utilize TCP packet size. 2017-09-12 00:38:33 +02:00
Bartosz Taudul
37405bafde Pack queue item. 2017-09-12 00:28:50 +02:00
Bartosz Taudul
8fb8e4f792 No need for sleep, Accept() already sleeps. 2017-09-11 23:16:17 +02:00
Bartosz Taudul
8747da8e2c Send event data over network. 2017-09-11 22:51:47 +02:00
Bartosz Taudul
8d3aae24bf Use producer tokens during event insertion. 2017-09-10 20:52:10 +02:00
Bartosz Taudul
452e5c5c83 Increase bulk size to 1024. 2017-09-10 20:40:28 +02:00
Bartosz Taudul
6886d5035e Dequeue events (and do nothing with them). 2017-09-10 20:23:06 +02:00
Bartosz Taudul
6a7fdea6fd Store profiling start time. 2017-09-10 20:14:16 +02:00
Bartosz Taudul
5964a6864c Scoped zone macro. 2017-09-10 20:10:20 +02:00
Bartosz Taudul
09f9937133 Scoped zone wrapper. 2017-09-10 20:09:57 +02:00
Bartosz Taudul
12a6306c0b Allow queuing zones. 2017-09-10 20:09:14 +02:00
Bartosz Taudul
05486c8225 Add unique event identifier source. 2017-09-10 20:08:42 +02:00
Bartosz Taudul
e4356eb67e Time retrieval function. 2017-09-10 20:07:38 +02:00
Bartosz Taudul
fc1b131c7a Add event queue structures. 2017-09-10 20:06:52 +02:00
Bartosz Taudul
ea9464f4f6 Make sure string constants are at the same memory address. 2017-09-10 20:02:40 +02:00
Bartosz Taudul
b4f8901a8d Add MPMC queue.
https://github.com/cameron314/concurrentqueue.git
b276773a1babd702b020a91ea2443985a65bab11
2017-09-10 19:01:14 +02:00
Bartosz Taudul
4a05da273f Set worker thread name. 2017-09-10 17:46:20 +02:00
Bartosz Taudul
a5d6039aea Profiler worker thread skeleton. 2017-09-10 17:43:56 +02:00