224 Commits

Author SHA1 Message Date
Bartosz Taudul
b76726c597 No need for lean callstack, callstack memory messages. 2020-07-26 14:23:03 +02:00
Bartosz Taudul
b7af9a0860 Reduce frame images frame index to 32 bit. 2020-07-26 13:46:05 +02:00
Bartosz Taudul
c0b73c248f Add second single string transfer. 2020-07-26 01:47:49 +02:00
Bartosz Taudul
e91950f006 Send single string for messages. 2020-07-26 01:35:52 +02:00
Bartosz Taudul
81d5a8db5e Implement transport of single string data.
In most cases only one string is sent per message and no pointer
tracking is needed.

This is only plumbing work, no changes to messages have been made yet.
2020-07-26 01:35:51 +02:00
Bartosz Taudul
02e7893c75 Preserve messages size. 2020-07-21 20:58:58 +02:00
Bartosz Taudul
e4fbf60668 Add SendString() with length parameter. 2020-07-21 20:58:58 +02:00
Bartosz Taudul
2bef3629b7
Merge pull request #74 from avoroshilov/manual-lifetime
Manual lifetime management for Multi-DLL
2020-07-19 12:06:11 +02:00
Bartosz Taudul
b8df7a1302 Expose m_isConnected in non-on-demand builds. 2020-07-16 11:22:06 +02:00
Andrey Voroshilov
cbfb19816b Merge remote-tracking branch 'tracy/master' into manual-lifetime
# Conflicts:
#	AUTHORS
2020-07-13 01:49:11 -07:00
Andrey Voroshilov
4c397ebe1e Fixing some of the copy-paste errors 2020-07-12 10:12:50 -07:00
Andrey Voroshilov
6b790d778d Removing spinlock that is no longer needed, making TRACY_MANUAL_LIFETIME a sub-option of TRACY_DELAYED_INIT, and addressing feedback 2020-07-12 10:04:07 -07:00
Bartosz Taudul
5e5bf928a5 Add QPC frequency query to API. 2020-07-07 21:25:35 +02:00
Andrey Voroshilov
6a72560989 Fixing functions case to match the source capitalization rules 2020-07-07 03:12:02 -07:00
Andrey Voroshilov
da5e58682f Adding manual lifetime management to aid multi-DLL use case 2020-07-07 00:39:09 -07:00
Bartosz Taudul
f718761905 Reduce allocated source location size by 2 bytes. 2020-07-05 17:11:15 +02:00
Bartosz Taudul
4179e85029 Add missing parameters. 2020-07-02 17:17:01 +02:00
Bartosz Taudul
4bbeb51e34 Add secure alloc/free macros. 2020-06-24 01:33:26 +02:00
Bartosz Taudul
530e464347 Add checker for profiler availability. 2020-06-24 01:32:57 +02:00
Simonas Kazlauskas
29886435b4 ___tracy_alloc_* take pointer-size pairs
This enables better bindings in languages that do not have 0-terminated
strings for source/function name. It does not introduce any additional
overhead in languages that do use 0-terminated strings, either, but it
_is_ a breaking API change.

Fixes https://github.com/wolfpld/tracy/issues/53
2020-06-20 20:35:42 +03:00
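
A minimal sketch of the pointer-size pair idea described in the commit above. `PackSourceLocation` and its buffer layout are hypothetical and only illustrate the calling convention; this is not the actual `___tracy_alloc_*` implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper (an assumption, not Tracy code). Each string is
// described by a (pointer, size) pair, so the inputs do not need to be
// 0-terminated. The layout used here (line, function, source) is made up.
std::string PackSourceLocation( uint32_t line,
                                const char* source, size_t sourceSz,
                                const char* function, size_t functionSz )
{
    std::string buf;
    buf.reserve( sizeof( line ) + functionSz + 1 + sourceSz + 1 );
    buf.append( reinterpret_cast<const char*>( &line ), sizeof( line ) );
    buf.append( function, functionSz );   // copy exactly functionSz bytes
    buf.push_back( '\0' );                // terminator is added by the callee
    buf.append( source, sourceSz );       // copy exactly sourceSz bytes
    buf.push_back( '\0' );
    return buf;
}

// Callers that already hold 0-terminated strings only pay for strlen().
std::string PackSourceLocationCStr( uint32_t line, const char* source, const char* function )
{
    return PackSourceLocation( line, source, std::strlen( source ),
                               function, std::strlen( function ) );
}
```

A binding for a language with length-prefixed strings could pass a slice directly, for example `PackSourceLocation( 42, path.data(), path.size(), name.data(), name.size() )` with `std::string_view` arguments, without first copying into a 0-terminated buffer.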
kudansam
1151ec1328 Fix defines when compiling with -Werror=undef
Some ARM defines fail when compiling with -Werror=undef as they rely on
the missing define mapping to 0.
2020-05-22 15:48:59 +02:00
Bartosz Taudul
fad7e72fd4 Harden against uninitialized rpmalloc.
Initialize rpmalloc either by explicitly calling InitRPMallocThread(),
or by forcing initialization of thread local variables block.
2020-05-19 13:51:11 +02:00
txfx
412d252eea Remove extra semicolons at the end of namespaces 2020-05-10 15:32:39 +02:00
Bartosz Taudul
a2187565d1 Optimize non-native-size memcpy. 2020-04-13 13:45:21 +02:00
Bartosz Taudul
b69aaf04e9 Add support for QPC timer. 2020-04-07 22:01:31 +02:00
Bartosz Taudul
b2a8b53efa Query source location of each assembly instruction. 2020-04-01 21:43:03 +02:00
Bartosz Taudul
f114ec3f80 Add code transfer from client to server. 2020-03-25 20:04:55 +01:00
Bartosz Taudul
c6bb08355c Allow specification of port through env variable. 2020-03-08 16:14:36 +01:00
Bartosz Taudul
26cee8acf0 Perform symbol information queries. 2020-02-26 22:35:15 +01:00
Bartosz Taudul
2b7f5091f1 Store sampling period. 2020-02-25 23:08:52 +01:00
Bartosz Taudul
02d200878d Process queue data in-place. 2020-02-23 15:18:24 +01:00
Bartosz Taudul
96034bca3e Force inline AppendData(), NeedDataSize(). 2020-02-23 14:44:19 +01:00
Bartosz Taudul
23fe3e623d 64-bit only version of callstack payload sender. 2020-02-22 14:05:01 +01:00
Bartosz Taudul
f186540c4f Fix callstack pointers in 32-bit builds. 2020-02-22 13:38:09 +01:00
Bartosz Taudul
aa94df0845 Replace rpmalloc_thread_initialize with InitRPMallocThread(). 2020-01-25 17:16:08 +01:00
Bartosz Taudul
ab2fbd6164 Move ParameterSetup() implementation to header. 2020-01-25 16:51:17 +01:00
Bartosz Taudul
a90004b983 Move Set/GetThreadName() to Tracy API. 2020-01-25 16:36:58 +01:00
Bartosz Taudul
55d03cb03e Hide async queue setup/commit behind macros. 2020-01-19 15:06:11 +01:00
Bartosz Taudul
68ff33d0ba Extract source location allocation functionality. 2019-12-06 00:15:46 +01:00
Bartosz Taudul
59371eef5a Obtain CPU topology on Windows. 2019-11-29 18:29:31 +01:00
Bartosz Taudul
4551553eb4 Implement setting client parameters from server. 2019-11-25 23:59:48 +01:00
Bartosz Taudul
8286b0b72f Plumbing for message call stacks. 2019-11-14 23:40:41 +01:00
Bartosz Taudul
0befc75f83 Fix conflicts with X.h. 2019-11-14 18:24:29 +01:00
Bartosz Taudul
6bbf273581 Partial header inclusion cleanup. 2019-11-05 20:09:40 +01:00
Bartosz Taudul
907574e637 Allow remote plot configuration. 2019-11-05 17:45:19 +01:00
Bartosz Taudul
0f2503d334 Send time deltas in GPU time events. 2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176 Send delta times for context switches. 2019-10-25 19:13:11 +02:00
Bartosz Taudul
ba61a9ed84 Transfer time deltas, not absolute times.
This change significantly reduces network bandwidth requirements.

Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
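
The delta transfer described in the commit above can be illustrated with a small sketch. `EncodeDeltas` and `DecodeDeltas` are hypothetical names, and the single running reference value is an assumption made for brevity; the real wire protocol is not shown in this log.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sender side: replace each absolute timestamp with its
// difference from the previously sent timestamp.
std::vector<int64_t> EncodeDeltas( const std::vector<int64_t>& absolute )
{
    std::vector<int64_t> deltas;
    deltas.reserve( absolute.size() );
    int64_t ref = 0;                 // reference time, starts at zero
    for( int64_t t : absolute )
    {
        deltas.push_back( t - ref );
        ref = t;
    }
    return deltas;
}

// Hypothetical receiver side: rebuild absolute timestamps by keeping
// the same running reference value.
std::vector<int64_t> DecodeDeltas( const std::vector<int64_t>& deltas )
{
    std::vector<int64_t> absolute;
    absolute.reserve( deltas.size() );
    int64_t ref = 0;
    for( int64_t d : deltas )
    {
        ref += d;
        absolute.push_back( ref );
    }
    return absolute;
}
```

With timestamps `{ 1000, 1012, 1015, 1100 }` the encoder produces `{ 1000, 12, 3, 85 }`. Consecutive events are usually close in time, so the deltas are small numbers with mostly zero high bytes, which is the kind of data that generally compresses well.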
Bartosz Taudul
cf88265304 Full 64-bit register is set by rdtsc. 2019-10-21 01:13:55 +02:00
Bartosz Taudul
c774534b47 Use rdtsc instead of rdtscp.
But rdtscp is serializing!

No, it's not. Quoting the Intel Instruction Set Reference:

"The RDTSCP instruction is not a serializing instruction, but it does
wait until all previous instructions have executed and all previous
loads are globally visible. But it does not wait for previous stores to
be globally visible, and subsequent instructions may begin execution
before the read operation is performed.",

"The RDTSC instruction is not a serializing instruction. It does not
necessarily wait until all previous instructions have been executed
before reading the counter. Similarly, subsequent instructions may begin
execution before the read operation is performed."

So, the difference is in waiting for prior instructions to finish
executing. Notice that even in the rdtscp case, execution of the
following instructions may commence before time measurement is finished
and data stores may be still pending.

But, you may say, Intel in its "How to Benchmark Code Execution Times"
document shows that using rdtscp is superior to rdtsc. Well, not
exactly. What they do show is that when a *single function* is
considered, there are ways to measure its execution time with little to
no error.

This is not what Tracy is doing.

In our case there is no way to determine absolute "this is before" and
"this is after" points of a zone, as we probably already are inside
another zone.  Stopping the CPU execution, so that a deeply nested zone
may be measured with great precision, will skew the measurements of all
parent zones.

And this is not what we want to measure, anyway. We are not interested
in how a *single function* behaves, but how a *whole program* behaves.
The out-of-order CPU behavior may influence the measurements? Good! We
are interested in that. We want to see *how* the code is really
executed. How is *stopping* the CPU to make a timer read an appropriate
thing to do, when we want to see how a program is performing?

At least that's the theory.

And besides all that, the profiling overhead is now reduced.
2019-10-20 20:52:33 +02:00
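
For reference, the two timer reads discussed in the commit above can be written with compiler intrinsics roughly as follows. This is a sketch only: the function names are hypothetical, the `std::chrono` fallback for non-x86 targets is an assumption added so the example builds everywhere, and none of this is taken from the profiler's source.

```cpp
#include <cstdint>

#if defined( _MSC_VER ) && ( defined( _M_X64 ) || defined( _M_IX86 ) )
#  include <intrin.h>
#  define SKETCH_HAS_TSC 1
#elif defined( __x86_64__ ) || defined( __i386__ )
#  include <x86intrin.h>
#  define SKETCH_HAS_TSC 1
#else
#  include <chrono>
#  define SKETCH_HAS_TSC 0
#endif

// rdtsc: does not wait for earlier instructions to finish executing.
inline uint64_t ReadTimerRelaxed()
{
#if SKETCH_HAS_TSC
    return __rdtsc();
#else
    // Fallback for non-x86 targets (assumption, for portability only).
    return (uint64_t)std::chrono::steady_clock::now().time_since_epoch().count();
#endif
}

// rdtscp: waits until all previous instructions have executed, but later
// instructions may still begin before the read is performed.
inline uint64_t ReadTimerWaiting()
{
#if SKETCH_HAS_TSC
    unsigned int aux;                // receives IA32_TSC_AUX, unused here
    return __rdtscp( &aux );
#else
    return (uint64_t)std::chrono::steady_clock::now().time_since_epoch().count();
#endif
}
```

A zone-style measurement would read the timer on entry and on exit and subtract the two values. With `ReadTimerRelaxed()` the CPU is free to reorder surrounding work across the reads, which is the behavior the commit message argues is acceptable when profiling a whole program.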