Bartosz Taudul
e4fbf60668
Add SendString() with length parameter.
2020-07-21 20:58:58 +02:00
Bartosz Taudul
2bef3629b7
Merge pull request #74 from avoroshilov/manual-lifetime
...
Manual lifetime management for Multi-DLL
2020-07-19 12:06:11 +02:00
Bartosz Taudul
b8df7a1302
Expose m_isConnected in non-on-demand builds.
2020-07-16 11:22:06 +02:00
Andrey Voroshilov
cbfb19816b
Merge remote-tracking branch 'tracy/master' into manual-lifetime
...
# Conflicts:
# AUTHORS
2020-07-13 01:49:11 -07:00
Andrey Voroshilov
4c397ebe1e
Fixing some of the copy-paste errors
2020-07-12 10:12:50 -07:00
Andrey Voroshilov
6b790d778d
Removing spinlock that is not needed anymore, making TRACY_MANUAL_LIFETIME a sub-option of TRACY_DELAYED_INIT, and addressing feedback
2020-07-12 10:04:07 -07:00
Bartosz Taudul
5e5bf928a5
Add QPC frequency query to API.
2020-07-07 21:25:35 +02:00
Andrey Voroshilov
6a72560989
Fixing functions case to match the source capitalization rules
2020-07-07 03:12:02 -07:00
Andrey Voroshilov
da5e58682f
Adding manual lifetime management to aid multi-DLL usecase
2020-07-07 00:39:09 -07:00
Bartosz Taudul
f718761905
Reduce allocated source location size by 2 bytes.
2020-07-05 17:11:15 +02:00
Bartosz Taudul
4179e85029
Add missing parameters.
2020-07-02 17:17:01 +02:00
Bartosz Taudul
4bbeb51e34
Add secure alloc/free macros.
2020-06-24 01:33:26 +02:00
Bartosz Taudul
530e464347
Add checker for profiler availability.
2020-06-24 01:32:57 +02:00
Simonas Kazlauskas
29886435b4
___tracy_alloc_* take pointer-size pairs
...
This enables better bindings in languages that do not have 0-terminated
strings for source/function name. It does not introduce any additional
overhead in languages that do use 0-terminated strings, either, but it
_is_ a breaking API change.
Fixes https://github.com/wolfpld/tracy/issues/53
2020-06-20 20:35:42 +03:00
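A minimal sketch of the pointer-size idea from commit 29886435b4, not the actual ___tracy_alloc_* signatures. CopyName and CopyNameCStr are hypothetical helpers used only for illustration: the first accepts a (pointer, size) pair and never looks for a terminator, the second shows how a caller that does have 0-terminated strings can still use it.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of Tracy): copies exactly `size` bytes
// starting at `ptr`. The input does not need a trailing '\0'.
std::string CopyName( const char* ptr, size_t size )
{
    return std::string( ptr, size );
}

// Hypothetical adapter for callers that hold 0-terminated strings:
// the length is computed once and forwarded as a pointer-size pair.
std::string CopyNameCStr( const char* str )
{
    return CopyName( str, std::strlen( str ) );
}
```

Bindings for languages whose strings carry an explicit length can call the first form directly, without building a temporary 0-terminated copy.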
kudansam
1151ec1328
Fix defines when compiling with -Werror=undef
...
Some ARM defines fail when compiling with -Werror=undef as they rely on
the missing define mapping to 0.
2020-05-22 15:48:59 +02:00
Bartosz Taudul
fad7e72fd4
Harden against uninitialized rpmalloc.
...
Initialize rpmalloc either by explicitly calling InitRPMallocThread(),
or by forcing initialization of thread local variables block.
2020-05-19 13:51:11 +02:00
txfx
412d252eea
Remove extra semicolons at the end of namespaces
2020-05-10 15:32:39 +02:00
Bartosz Taudul
a2187565d1
Optimize non-native-size memcpy.
2020-04-13 13:45:21 +02:00
Bartosz Taudul
b69aaf04e9
Add support for QPC timer.
2020-04-07 22:01:31 +02:00
Bartosz Taudul
b2a8b53efa
Query source location of each assembly instruction.
2020-04-01 21:43:03 +02:00
Bartosz Taudul
f114ec3f80
Add code transfer from client to server.
2020-03-25 20:04:55 +01:00
Bartosz Taudul
c6bb08355c
Allow specification of port through env variable.
2020-03-08 16:14:36 +01:00
Bartosz Taudul
26cee8acf0
Perform symbol information queries.
2020-02-26 22:35:15 +01:00
Bartosz Taudul
2b7f5091f1
Store sampling period.
2020-02-25 23:08:52 +01:00
Bartosz Taudul
02d200878d
Process queue data in-place.
2020-02-23 15:18:24 +01:00
Bartosz Taudul
96034bca3e
Force inline AppendData(), NeedDataSize().
2020-02-23 14:44:19 +01:00
Bartosz Taudul
23fe3e623d
64-bit only version of callstack payload sender.
2020-02-22 14:05:01 +01:00
Bartosz Taudul
f186540c4f
Fix callstack pointers in 32-bit builds.
2020-02-22 13:38:09 +01:00
Bartosz Taudul
aa94df0845
Replace rpmalloc_thread_initialize with InitRPMallocThread().
2020-01-25 17:16:08 +01:00
Bartosz Taudul
ab2fbd6164
Move ParameterSetup() implementation to header.
2020-01-25 16:51:17 +01:00
Bartosz Taudul
a90004b983
Move Set/GetThreadName() to Tracy API.
2020-01-25 16:36:58 +01:00
Bartosz Taudul
55d03cb03e
Hide async queue setup/commit behind macros.
2020-01-19 15:06:11 +01:00
Bartosz Taudul
68ff33d0ba
Extract source location allocation functionality.
2019-12-06 00:15:46 +01:00
Bartosz Taudul
59371eef5a
Obtain CPU topology on windows.
2019-11-29 18:29:31 +01:00
Bartosz Taudul
4551553eb4
Implement setting client parameters from server.
2019-11-25 23:59:48 +01:00
Bartosz Taudul
8286b0b72f
Plumbing for message call stacks.
2019-11-14 23:40:41 +01:00
Bartosz Taudul
0befc75f83
Fix conflicts with X.h.
2019-11-14 18:24:29 +01:00
Bartosz Taudul
6bbf273581
Partial header inclusion cleanup.
2019-11-05 20:09:40 +01:00
Bartosz Taudul
907574e637
Allow remote plot configuration.
2019-11-05 17:45:19 +01:00
Bartosz Taudul
0f2503d334
Send time deltas in GPU time events.
2019-10-25 19:52:01 +02:00
Bartosz Taudul
8fa5188176
Send delta times for context switches.
2019-10-25 19:13:11 +02:00
Bartosz Taudul
ba61a9ed84
Transfer time deltas, not absolute times.
...
This change significantly reduces network bandwidth requirements.
Implemented for:
- CPU zones,
- GPU zones,
- locks,
- plots,
- memory events.
2019-10-24 00:06:41 +02:00
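The delta transfer described in commit ba61a9ed84 can be sketched as follows. This is an illustration under assumptions, not Tracy's wire format: EncodeDeltas and DecodeDeltas are hypothetical names, and the first delta is taken relative to zero. Why the bandwidth drops is not spelled out in the commit; a plausible reading is that small differences compress better than large absolute timestamps.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical encoder: each output value is the difference between a
// timestamp and the one before it (the first is relative to 0).
std::vector<int64_t> EncodeDeltas( const std::vector<int64_t>& absolute )
{
    std::vector<int64_t> out;
    out.reserve( absolute.size() );
    int64_t prev = 0;
    for( int64_t t : absolute )
    {
        out.push_back( t - prev );
        prev = t;
    }
    return out;
}

// Hypothetical decoder: a running sum restores the absolute timestamps.
std::vector<int64_t> DecodeDeltas( const std::vector<int64_t>& deltas )
{
    std::vector<int64_t> out;
    out.reserve( deltas.size() );
    int64_t acc = 0;
    for( int64_t d : deltas )
    {
        acc += d;
        out.push_back( acc );
    }
    return out;
}
```

The receiver only needs to keep one reference time per stream to reconstruct absolute values.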
Bartosz Taudul
cf88265304
Full 64-bit register is set by rdtsc.
2019-10-21 01:13:55 +02:00
Bartosz Taudul
c774534b47
Use rdtsc instead of rdtscp.
...
But rdtscp is serializing!
No, it's not. Quoting the Intel Instruction Set Reference:
"The RDTSCP instruction is not a serializing instruction, but it does
wait until all previous instructions have executed and all previous
loads are globally visible. But it does not wait for previous stores to
be globally visible, and subsequent instructions may begin execution
before the read operation is performed.",
"The RDTSC instruction is not a serializing instruction. It does not
necessarily wait until all previous instructions have been executed
before reading the counter. Similarly, subsequent instructions may begin
execution before the read operation is performed."
So, the difference is in waiting for prior instructions to finish
executing. Notice that even in the rdtscp case, execution of the
following instructions may commence before time measurement is finished
and data stores may be still pending.
But, you may say, Intel in its "How to Benchmark Code Execution Times"
document shows that using rdtscp is superior to rdtsc. Well, not
exactly. What they do show is that when a *single function* is
considered, there are ways to measure its execution time with little to
no error.
This is not what Tracy is doing.
In our case there is no way to determine absolute "this is before" and
"this is after" points of a zone, as we probably already are inside
another zone. Stopping the CPU execution, so that a deeply nested zone
may be measured with great precision, will skew the measurements of all
parent zones.
And this is not what we want to measure, anyway. We are not interested
in how a *single function* behaves, but how a *whole program* behaves.
The out-of-order CPU behavior may influence the measurements? Good! We
are interested in that. We want to see *how* the code is really
executed. How is *stopping* the CPU to make a timer read an appropriate
thing to do, when we want to see how a program is performing?
At least that's the theory.
And besides all that, the profiling overhead is now reduced.
2019-10-20 20:52:33 +02:00
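For reference, a non-serializing timer read as argued for in commit c774534b47 might look like the sketch below. ReadTimer and HAS_RDTSC are hypothetical names, not Tracy's implementation; the __rdtsc() intrinsic is the compiler-provided one, and the std::chrono fallback for non-x86 targets is an assumption added so the sketch builds everywhere.

```cpp
#include <chrono>
#include <cstdint>

#if defined( _MSC_VER ) && ( defined( _M_X64 ) || defined( _M_IX86 ) )
#  include <intrin.h>
#  define HAS_RDTSC 1
#elif defined( __x86_64__ ) || defined( __i386__ )
#  include <x86intrin.h>
#  define HAS_RDTSC 1
#endif

// Hypothetical timer read. On x86 it issues a plain rdtsc with no fence
// around it, so out-of-order execution is not stalled by the read.
inline int64_t ReadTimer()
{
#ifdef HAS_RDTSC
    return (int64_t)__rdtsc();
#else
    // Assumed fallback for other architectures.
    using namespace std::chrono;
    return duration_cast<nanoseconds>( steady_clock::now().time_since_epoch() ).count();
#endif
}
```

The value is in timer ticks on x86 and in nanoseconds in the fallback, so any conversion to wall time has to account for which path was compiled.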
Bartosz Taudul
30fc2f02ab
Omit calculation of on-stack variable address.
2019-10-20 19:42:29 +02:00
Bartosz Taudul
7cf3608493
Avoid unused variables.
2019-10-05 02:11:45 +02:00
Bartosz Taudul
fe7f56b022
Implement retrieval of external process names.
2019-08-16 19:22:23 +02:00
Bartosz Taudul
69077e4e6f
Finish sending context switches during disconnect.
2019-08-14 23:06:13 +02:00
Bartosz Taudul
c0b524d8de
Add a separate method for clearing serial queue.
2019-08-14 22:39:12 +02:00
Bartosz Taudul
5fbb811f5d
Degrade ARM timer to monotonic raw clock.
...
The monotonic raw clock has the same accuracy as reading cntvct registers, but
using clock_gettime() has a measurable impact on queueing time (135 us vs
83 us).
This change is needed to enable ftrace time readings on ARM linux, which
doesn't provide any way to get raw cntvct readings, like x86-tsc on x86.
2019-08-14 16:19:02 +02:00
Bartosz Taudul
602c38c6c0
Allow checking timer implementation.
2019-08-14 14:35:44 +02:00
Bartosz Taudul
fe0f1aea07
Add system tracing skeleton.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
8aa0be39d5
Drop support for CPU id queries.
2019-08-12 23:05:34 +02:00
Bartosz Taudul
0431c03556
Add serial queue interface.
2019-08-12 13:27:15 +02:00
Bartosz Taudul
12969ee497
Track thread context.
...
This change exploits the fact that events are processed in batches
originating from a single thread. A single message changing thread
context is enough to handle multiple messages, as opposed to inclusion
of thread identifier in each message.
2019-08-02 20:18:08 +02:00
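The batching idea from commit 12969ee497 can be illustrated with a small serializer. All types and names here (MsgType, Msg, Event, Serialize) are hypothetical and far simpler than Tracy's queue items; the sketch only shows that a thread-context message is emitted when the originating thread changes, instead of a thread identifier being stored in every message.

```cpp
#include <cstdint>
#include <vector>

enum class MsgType : uint8_t { ThreadContext, ZoneBegin, ZoneEnd };

// Hypothetical wire message: for ThreadContext the payload is a thread id,
// for zone events it is a timestamp.
struct Msg { MsgType type; uint64_t payload; };

// Hypothetical input event, still tagged with its thread.
struct Event { uint64_t thread; MsgType type; uint64_t time; };

// Emits one ThreadContext message per run of events from the same thread.
std::vector<Msg> Serialize( const std::vector<Event>& events )
{
    std::vector<Msg> out;
    bool hasCtx = false;
    uint64_t ctx = 0;
    for( const auto& e : events )
    {
        if( !hasCtx || e.thread != ctx )
        {
            out.push_back( { MsgType::ThreadContext, e.thread } );
            ctx = e.thread;
            hasCtx = true;
        }
        out.push_back( { e.type, e.time } );
    }
    return out;
}
```

The longer the batch from a single thread, the smaller the per-message overhead of the context switch message.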
Bartosz Taudul
a4e7a341c0
Proper handling of disconnect request.
2019-08-01 23:14:09 +02:00
Bartosz Taudul
a6a3f45810
Fill in thread id during dequeue, not during enqueue.
2019-07-30 01:15:14 +02:00
Bartosz Taudul
89928fde7b
Queue must be always able to alloc.
2019-07-29 22:13:16 +02:00
Bartosz Taudul
82a4a6d9cc
Add tracy_ prefix to concurrentqueue.h file name.
2019-07-29 21:47:50 +02:00
Alex
0c5ea710b0
Merged in z33ky/tracy/const-frame-image (pull request #37)
...
Constify frame-image pointer in API.
2019-07-13 13:09:21 +00:00
Alexander 'z33ky' Hirsch
c6e8dc8d63
Constify frame-image pointer in API.
2019-07-13 12:33:55 +02:00
Bartosz Taudul
60d2384a6a
Allow sending application information messages.
2019-07-12 18:34:46 +02:00
Bartosz Taudul
bb35f9a897
Compress frame images in a separate thread.
2019-06-27 13:24:35 +02:00
Bartosz Taudul
7ebd2162c6
Add ETC1 compression thread.
2019-06-26 22:57:24 +02:00
Bartosz Taudul
f565e11976
Store frame images in queue.
2019-06-26 22:52:24 +02:00
Bartosz Taudul
8ce41b3543
Proper init order of thread local thread handle.
2019-06-26 19:32:52 +02:00
Bartosz Taudul
06a41708a7
Move TLS accesses close together.
2019-06-24 19:38:44 +02:00
Bartosz Taudul
0b394c3f53
Don't need to keep last broadcast time in Profiler class.
2019-06-18 20:15:09 +02:00
Bartosz Taudul
e609c0fdce
UDP broadcast loop.
2019-06-17 02:25:09 +02:00
Bartosz Taudul
37d1457b44
Frame image may need flipping.
2019-06-12 15:28:32 +02:00
Bartosz Taudul
04dd33f5c4
Fix mismatched linkage.
2019-06-11 23:51:12 +02:00
Rokas K. (rku)
c4e05b6264
Merged in rokups/tracy/dllimport-cleanup (pull request #36)
...
Clean up imported functions in multi-dll projects.
Approved-by: Till Rathmann <till.rathmann@gmx.de>
2019-06-11 15:04:34 +00:00
Bartosz Taudul
80dff1ede1
Add connection id for on-demand mode.
...
Long-lived zones could send their end events without begin events in the
following scenario:
1. On-demand connection is made.
2. Zone begin is emitted, m_active is set to true.
3. Connection is terminated.
4. A new connection is made.
5. Zone end is emitted, because m_active is true.
Up to this point it was assumed that all zone end events will happen before
a new connection is made, but it's not necessarily true.
2019-06-09 17:15:47 +02:00
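The scenario in commit 80dff1ede1 can be reproduced with a small model. This is a sketch under assumptions, not Tracy's code: Connection, Zone, Connect, Disconnect, BeginZone and ShouldSendEnd are hypothetical names, and the zone simply remembers which connection it started under.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical connection state: the id grows with every new connection.
struct Connection
{
    std::atomic<bool> connected{ false };
    std::atomic<uint64_t> id{ 0 };
};

void Connect( Connection& c )
{
    c.id.fetch_add( 1 );
    c.connected.store( true );
}

void Disconnect( Connection& c )
{
    c.connected.store( false );
}

// Hypothetical zone: records whether it was active at begin time and
// under which connection id the begin event was emitted.
struct Zone { bool active; uint64_t connectionId; };

Zone BeginZone( const Connection& c )
{
    return { c.connected.load(), c.id.load() };
}

// The end event is sent only if the zone began under the current connection.
bool ShouldSendEnd( const Zone& z, const Connection& c )
{
    return z.active && c.connected.load() && z.connectionId == c.id.load();
}
```

With only the active flag, step 5 of the scenario would send an end event; comparing the stored id against the current one suppresses it.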
Bartosz Taudul
cc5bad294a
More strict memory ordering for on-demand connection status.
2019-06-09 16:48:00 +02:00
Bartosz Taudul
23e7850162
Make DequeueStatus enum class.
2019-06-09 16:14:30 +02:00
Bartosz Taudul
4c2ff80ac8
Restore frame counting for on-demand mode.
2019-06-09 15:23:01 +02:00
Bartosz Taudul
784c4da53a
Include frame offset in frame image message.
2019-06-07 20:09:29 +02:00
Rokas Kupstys
9bd1037347
Clean up imported functions in multi-dll projects.
2019-06-07 19:50:08 +03:00
Bartosz Taudul
d271634a95
Keep one ETC1 compression buffer.
2019-06-07 01:29:24 +02:00
Bartosz Taudul
a654b642ef
Compress frame images to ETC1 before sending.
2019-06-07 00:31:51 +02:00
Bartosz Taudul
e5bb6011c5
Frame image transfer prototype.
2019-06-06 21:39:54 +02:00
Bartosz Taudul
efc54babe3
Transfer of colored messages.
2019-05-10 20:17:44 +02:00
Bartosz Taudul
9ec8704dad
Don't include LZ4 headers in tracy headers.
...
The LZ4 implementation is wrapped in tracy namespace, but it also adds
some defines, which may conflict with other LZ4 implementations.
2019-05-01 12:57:42 +02:00
Bartosz Taudul
ec73178733
Move callstack cutting to a separate function.
2019-03-05 02:42:51 +01:00
Bartosz Taudul
e3c31e4a4e
Send callstack alloc payload.
2019-03-03 18:05:03 +01:00
Bartosz Taudul
d863245b49
Serialize discontinuous frame messages.
2019-02-28 19:21:23 +01:00
Bartosz Taudul
9f4f5bcb63
CPU usage retrieval.
2019-02-21 22:45:53 +01:00
Bartosz Taudul
44009b6fda
Use mach_absolute_time() to get time on iOS.
2019-02-21 14:45:13 +01:00
Bartosz Taudul
ef5e30056e
Implement delayed initialization of the profiler.
...
Enabled on osx, ios.
2019-02-19 20:43:30 +01:00
Bartosz Taudul
3f914834b7
Hide rest of statics.
2019-02-19 19:33:37 +01:00
Bartosz Taudul
9fabafbeca
Fix DLL code.
2019-02-19 18:46:59 +01:00
Bartosz Taudul
2421e05c27
Prevent direct access to s_profiler.
2019-02-19 18:38:08 +01:00
Bartosz Taudul
d865d1cc87
Disallow direct access to s_token.
2019-02-19 18:27:00 +01:00
Rokas Kupstys
8157e3a0b3
Fix builds with MingW.
2019-01-19 13:53:10 +02:00
Bartosz Taudul
970108fbbf
Track zone id for verification purposes.
2019-01-14 22:36:54 +01:00
Bartosz Taudul
1f0d1fdfdc
C API prototype.
2019-01-14 21:07:29 +01:00
Bartosz Taudul
070888f80d
Make it possible to have multiple vulkan contexts.
...
API change!
2019-01-10 17:11:17 +01:00
Bartosz Taudul
facb05f8cb
Don't mark FastVector element as used until it's ready.
...
This should prevent a race condition that would result in invalid last
element of the queue, in case a frozen thread already got the queue
item, but didn't write to it (or didn't write fully).
2018-08-20 22:35:50 +02:00
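The ordering fix described in commit facb05f8cb can be sketched as a two-phase append. TwoPhaseVector and its method names are hypothetical and single-threaded; the sketch only illustrates that the element count changes after the element has been fully written, so a reader that looks at size() never sees a half-written last element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity container with a prepare/commit append.
template<typename T>
class TwoPhaseVector
{
public:
    explicit TwoPhaseVector( size_t capacity ) : m_data( capacity ), m_size( 0 ) {}

    // Returns storage for the next element without marking it as used.
    // Returns nullptr when the container is full.
    T* PrepareNext() { return m_size < m_data.size() ? &m_data[m_size] : nullptr; }

    // Marks the prepared element as used. Call only after writing it.
    void CommitNext() { if( m_size < m_data.size() ) ++m_size; }

    size_t size() const { return m_size; }
    const T& operator[]( size_t i ) const { return m_data[i]; }

private:
    std::vector<T> m_data;
    size_t m_size;
};
```

If the writing thread is frozen between PrepareNext() and CommitNext(), the container still reports the previous size and the incomplete item is ignored.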
Bartosz Taudul
d1adf9e8d6
Allow skipping functions on top of call stack.
...
Note that this is performance intensive on the client and shouldn't be used,
except in special situations, like processing crashes.
2018-08-20 22:20:44 +02:00
Bartosz Taudul
b371003336
In case of manual shutdown, don't wait for lock.
...
All threads are frozen at this point, nothing will release it.
2018-08-20 21:49:23 +02:00