tracy

mirror of https://github.com/wolfpld/tracy.git synced 2024-11-10 18:51:46 +00:00

Author	SHA1	Message	Date
Bartosz Taudul	4ef2ce4622	Fix _mm256_cvtsi256_si32 on gcc.	2019-12-12 02:13:12 +01:00
Bartosz Taudul	129b80ef0f	Free source location, if zone is not active.	2019-12-06 00:42:42 +01:00
Bartosz Taudul	b9cdf2cbb7	Expose srcloc allocation in C API.	2019-12-06 00:25:52 +01:00
Bartosz Taudul	399b87fecc	Add allocated srcloc zone begin emit functions to C API.	2019-12-06 00:22:49 +01:00
Bartosz Taudul	68ff33d0ba	Extract source location allocation functionality.	2019-12-06 00:15:46 +01:00
Bartosz Taudul	e8fcc250a1	Report CPU topology on Linux.	2019-11-30 01:51:29 +01:00
Bartosz Taudul	712403e9fd	Transfer, display, save CPU topology data.	2019-11-29 22:41:41 +01:00
Bartosz Taudul	59371eef5a	Obtain CPU topology on windows.	2019-11-29 18:29:31 +01:00
thedmd	a1e2c533f6	libbacktrace: Add support for Mach-O (dSYM) `macho.cpp` was backported from official libbacktrace repository.	2019-11-29 12:04:47 +01:00
Bartosz Taudul	a7d2d5f08b	Fix DeferItem() call.	2019-11-26 01:10:50 +01:00
Bartosz Taudul	4551553eb4	Implement setting client parameters from server.	2019-11-25 23:59:48 +01:00
Bartosz Taudul	c5c9dfb0c9	Native callstacks are now optional in allocated callstack messages.	2019-11-25 22:54:10 +01:00
Bartosz Taudul	37eef59d54	Implement reading sys time on BSD.	2019-11-21 20:41:57 +01:00
Bartosz Taudul	c7a22cc1ff	Use libbacktrace on BSD.	2019-11-21 20:41:57 +01:00
Bartosz Taudul	bd7b0a8197	Support callstack capture on BSD.	2019-11-21 02:34:42 +01:00
Bartosz Taudul	c79449a6a1	Get proper program name on BSD.	2019-11-21 02:16:12 +01:00
Bartosz Taudul	7940977dba	Report physical memory size on BSD.	2019-11-21 02:14:08 +01:00
Bartosz Taudul	3854ae11b2	Revert "Remove dead code." This reverts commit `a36b73f745`.	2019-11-17 17:38:02 +01:00
Bartosz Taudul	a36b73f745	Remove dead code.	2019-11-16 18:34:05 +01:00
Bartosz Taudul	8286b0b72f	Plumbing for message call stacks.	2019-11-14 23:40:41 +01:00
Bartosz Taudul	0befc75f83	Fix conflicts with X.h.	2019-11-14 18:24:29 +01:00
Bartosz Taudul	655864eb7c	Enable crash handler on cygwin. Crash is properly recorded, but the profiler hangs while waiting for shutdown finish.	2019-11-07 19:20:13 +01:00
Bartosz Taudul	3fd74a92f9	Native threads are used on mingw.	2019-11-07 19:02:54 +01:00
Bartosz Taudul	351e220d30	Don't calculate queue delay if delayed init is used. Queue calibration requires queue access during profiler construction. This in turn requires construction of profiler data block, which at this point is underway, because the profiler is being constructed.	2019-06-19 17:29:04 +02:00
Bartosz Taudul	c98f1f0b6b	Make sure profiler is initialized only once in delayed init scenario.	2019-06-19 17:28:18 +02:00
Bartosz Taudul	d4f58ddaf3	Use native windows threads on cygwin, mingw.	2019-11-06 01:42:14 +01:00
Bartosz Taudul	ca198e44d3	Remove dead code from concurrentqueue.	2019-11-05 21:40:52 +01:00
Bartosz Taudul	b5590ed197	Include <mutex> for std::once.	2019-11-05 21:40:35 +01:00
Bartosz Taudul	3e9bb80217	More header cleanup.	2019-11-05 20:15:53 +01:00
Bartosz Taudul	6bbf273581	Partial header inclusion cleanup.	2019-11-05 20:09:40 +01:00
Bartosz Taudul	907574e637	Allow remote plot configuration.	2019-11-05 17:45:19 +01:00
Bartosz Taudul	f34609fd9b	Set per-cpu kernel buffer size to 512 KB. The default setting was causing events to be lost on Android.	2019-11-03 21:52:20 +01:00
Bartosz Taudul	b8d459d48b	Use proper string size (for consistency). On Android code path this value is ignored.	2019-11-03 21:51:49 +01:00
Bartosz Taudul	ca0fae33d1	Remove obsolete assert. Before-terminate-events now include events that have time delta processing, with no memory to free.	2019-11-01 20:10:24 +01:00
Bartosz Taudul	1f0c18882c	Don't collect sys time after application has exited.	2019-10-29 23:05:14 +01:00
Bartosz Taudul	0f2503d334	Send time deltas in GPU time events.	2019-10-25 19:52:01 +02:00
Bartosz Taudul	8fa5188176	Send delta times for context switches.	2019-10-25 19:13:11 +02:00
Bartosz Taudul	25b3cdc1ee	Send thread wakeups when handling disconnect request.	2019-10-25 18:22:42 +02:00
Bartosz Taudul	04b132b6e2	Check if requested data size doesn't overflow buffer.	2019-10-24 21:22:22 +02:00
Bartosz Taudul	ba61a9ed84	Transfer time deltas, not absolute times. This change significantly reduces network bandwidth requirements. Implemented for: - CPU zones, - GPU zones, - locks, - plots, - memory events.	2019-10-24 00:06:41 +02:00
Bartosz Taudul	cf88265304	Full 64-bit register is set by rdtsc.	2019-10-21 01:13:55 +02:00
Bartosz Taudul	07b66cd4ab	Move fake source location out of loop.	2019-10-20 22:18:05 +02:00
Bartosz Taudul	909503403b	Simplify delay calibration.	2019-10-20 22:13:29 +02:00
Bartosz Taudul	c774534b47	Use rdtsc instead of rdtscp. But rdtscp is serializing! No, it's not. Quoting the Intel Instruction Set Reference: "The RDTSCP instruction is not a serializing instruction, but it does wait until all previous instructions have executed and all previous loads are globally visible. But it does not wait for previous stores to be globally visible, and subsequent instructions may begin execution before the read operation is performed.", "The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed." So, the difference is in waiting for prior instructions to finish executing. Notice that even in the rdtscp case, execution of the following instructions may commence before time measurement is finished and data stores may be still pending. But, you may say, Intel in its "How to Benchmark Code Execution Times" document shows that using rdtscp is superior to rdtsc. Well, not exactly. What they do show is that when a single function is considered, there are ways to measure its execution time with little to no error. This is not what Tracy is doing. In our case there is no way to determine absolute "this is before" and "this is after" points of a zone, as we probably already are inside another zone. Stopping the CPU execution, so that a deeply nested zone may be measured with great precision, will skew the measurements of all parent zones. And this is not what we want to measure, anyway. We are not interested in how a single function behaves, but how a whole program behaves. The out-of-order CPU behavior may influence the measurements? Good! We are interested in that. We want to see how the code is really executed. How is stopping the CPU to make a timer read an appropriate thing to do, when we want to see how a program is performing? 
At least that's the theory. And besides all that, the profiling overhead is now reduced.	2019-10-20 20:52:33 +02:00
Bartosz Taudul	30fc2f02ab	Omit calculation of on-stack variable address.	2019-10-20 19:42:29 +02:00
Bartosz Taudul	c3870f8837	Use proper type.	2019-10-10 20:30:08 +02:00
Bartosz Taudul	707f113bda	Add missing NOMINMAX definitions.	2019-10-10 20:29:06 +02:00
Bartosz Taudul	7cf3608493	Avoid unused variables.	2019-10-05 02:11:45 +02:00
Bartosz Taudul	e481b5ba22	Add missing thread sent indication.	2019-10-04 19:18:47 +02:00
Bartosz Taudul	9e1935f070	Make C API symbols visible across dlls.	2019-10-03 22:39:26 +02:00
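
Commit ba61a9ed84 in the log above describes transferring time deltas instead of absolute times to reduce network bandwidth. The following is a minimal sketch of that general idea, not Tracy's actual wire format or code: the function names `EncodeDeltas` and `DecodeDeltas` are hypothetical, and the sketch assumes timestamps arrive as a plain vector of 64-bit integers with an implicit starting reference of 0.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Tracy): replace each timestamp with its
// difference from the previous one. The first entry is relative to 0.
std::vector<int64_t> EncodeDeltas( const std::vector<int64_t>& times )
{
    std::vector<int64_t> deltas;
    deltas.reserve( times.size() );
    int64_t ref = 0;
    for( int64_t t : times )
    {
        deltas.push_back( t - ref );
        ref = t;
    }
    return deltas;
}

// Hypothetical helper (not part of Tracy): rebuild absolute timestamps by
// accumulating the deltas, starting from the same reference of 0.
std::vector<int64_t> DecodeDeltas( const std::vector<int64_t>& deltas )
{
    std::vector<int64_t> times;
    times.reserve( deltas.size() );
    int64_t ref = 0;
    for( int64_t d : deltas )
    {
        ref += d;
        times.push_back( ref );
    }
    return times;
}
```

For example, encoding the timestamps 1000000, 1000250, 1000900 yields 1000000, 250, 650, and decoding restores the original values. Consecutive events in a trace are usually close in time, so most deltas are small numbers, which tend to be cheaper to pack or compress than full absolute values. Sender and receiver must agree on the reference point for this to work.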

666 Commits