Frame profiler
Bartosz Taudul 4422fce55c Don't decompress GpuZone threads while saving trace.
Saving the threads compressed in GPU zones and memory event has the
following result on trace dump sizes:

043/aa.tracy (0.4.3) {10055 KB} -> 044/aa.tracy (0.4.4) {9975 KB}  99.20% size change
043/android.tracy (0.4.3) {542739 KB} -> 044/android.tracy (0.4.4) {519248 KB}  95.67% size change
043/asset-new.tracy (0.4.3) {78403 KB} -> 044/asset-new.tracy (0.4.4) {75899 KB}  96.81% size change
043/asset-new-id.tracy (0.4.3) {84341 KB} -> 044/asset-new-id.tracy (0.4.4) {81771 KB}  96.95% size change
043/asset-old.tracy (0.4.3) {80688 KB} -> 044/asset-old.tracy (0.4.4) {78410 KB}  97.18% size change
043/big.tracy (0.4.3) {939577 KB} -> 044/big.tracy (0.4.4) {938427 KB}  99.88% size change
043/callstack.tracy (0.4.3) {14557 KB} -> 044/callstack.tracy (0.4.4) {14465 KB}  99.37% size change
043/callstack-linux.tracy (0.4.3) {6949 KB} -> 044/callstack-linux.tracy (0.4.4) {6942 KB}  99.90% size change
043/crash.tracy (0.4.3) {131 KB} -> 044/crash.tracy (0.4.4) {127 KB}  97.10% size change
043/crash2.tracy (0.4.3) {1422 KB} -> 044/crash2.tracy (0.4.4) {1412 KB}  99.29% size change
043/darkrl.tracy (0.4.3) {15767 KB} -> 044/darkrl.tracy (0.4.4) {15663 KB}  99.34% size change
043/darkrl2.tracy (0.4.3) {7947 KB} -> 044/darkrl2.tracy (0.4.4) {7886 KB}  99.23% size change
043/darkrl-old.tracy (0.4.3) {67448 KB} -> 044/darkrl-old.tracy (0.4.4) {67004 KB}  99.34% size change
043/deadlock.tracy (0.4.3) {5984 KB} -> 044/deadlock.tracy (0.4.4) {5986 KB}  100.03% size change
043/gn-opengl.tracy (0.4.3) {29005 KB} -> 044/gn-opengl.tracy (0.4.4) {28885 KB}  99.59% size change
043/gn-vulkan.tracy (0.4.3) {29352 KB} -> 044/gn-vulkan.tracy (0.4.4) {29257 KB}  99.68% size change
043/long.tracy (0.4.3) {1182800 KB} -> 044/long.tracy (0.4.4) {1176584 KB}  99.47% size change
043/mem.tracy (0.4.3) {1369067 KB} -> 044/mem.tracy (0.4.4) {1262406 KB}  92.21% size change
043/multi.tracy (0.4.3) {8004 KB} -> 044/multi.tracy (0.4.4) {7944 KB}  99.24% size change
043/new.tracy (0.4.3) {1108 KB} -> 044/new.tracy (0.4.4) {1099 KB}  99.18% size change
043/q3bsp-mt.tracy (0.4.3) {949855 KB} -> 044/q3bsp-mt.tracy (0.4.4) {937574 KB}  98.71% size change
043/q3bsp-st.tracy (0.4.3) {240347 KB} -> 044/q3bsp-st.tracy (0.4.4) {230092 KB}  95.73% size change
043/selfprofile.tracy (0.4.3) {197708 KB} -> 044/selfprofile.tracy (0.4.4) {197659 KB}  99.98% size change
043/tbrowser.tracy (0.4.3) {9503 KB} -> 044/tbrowser.tracy (0.4.4) {9503 KB}  100.00% size change
043/test.tracy (0.4.3) {40700 KB} -> 044/test.tracy (0.4.4) {40699 KB}  100.00% size change
043/virtualfile_hc.tracy (0.4.3) {72424 KB} -> 044/virtualfile_hc.tracy (0.4.4) {72304 KB}  99.83% size change
043/zfile_hc.tracy (0.4.3) {39419 KB} -> 044/zfile_hc.tracy (0.4.4) {39328 KB}  99.77% size change
2019-02-17 00:29:01 +01:00
capture Handle dropped connection in capture utility. 2019-02-12 11:13:53 +01:00
client Replace select() with poll(). 2019-02-10 15:45:23 +01:00
common Handle dropped connection during handshake. 2019-02-12 01:41:09 +01:00
doc Add histogram, compare screenshots to README. 2019-01-15 19:55:41 +01:00
extra X11 colors conversion program. 2018-07-04 18:26:57 +02:00
imgui Update imgui to 1.67. Also update imguicolortextedit. 2019-01-19 14:05:54 +01:00
imguicolortextedit Update imgui to 1.67. Also update imguicolortextedit. 2019-01-19 14:05:54 +01:00
libbacktrace Add modified libbacktrace. 2019-01-20 16:53:45 +01:00
manual Update manual. 2019-02-10 17:33:39 +01:00
nfd Workaround in nfd_win.cpp for MSVC problem in combaseapi.h. 2018-08-01 14:44:39 +02:00
profiler Window position may be negative. 2019-02-12 01:26:14 +01:00
server Don't decompress GpuZone threads while saving trace. 2019-02-17 00:29:01 +01:00
test Function inlining test. 2019-01-20 16:55:09 +01:00
update Display dump file size change in the update utility. 2018-12-30 23:47:43 +01:00
.appveyor.yml Let's try building on ubuntu 1804. 2019-02-08 02:38:26 +01:00
.gitignore Use freetype to render fonts. 2018-08-17 21:40:15 +02:00
AUTHORS Add Dedmen Miller to AUTHORS. 2019-02-08 02:37:30 +01:00
FAQ.md Mention on-demand mode in FAQ. 2018-07-12 13:32:49 +02:00
LICENSE Update year in copyright notice. 2018-12-30 17:51:17 +01:00
NEWS Update NEWS. 2019-02-10 17:25:19 +01:00
README.md Add histogram, compare screenshots to README. 2019-01-15 19:55:41 +01:00
Tracy.hpp Allow forcing call stack capture. 2018-12-13 14:43:37 +01:00
TracyC.h Use language neutral header for callstack capability detection. 2019-01-27 13:41:32 +01:00
TracyClient.cpp Compile with libbacktrace on linux. 2019-01-20 16:55:33 +01:00
TracyClientDLL.cpp Fix builds with MingW. 2019-01-19 13:53:10 +02:00
TracyLua.hpp Explicitly cast size_t to uint32_t. 2018-08-22 16:30:37 +02:00
TracyOpenGL.hpp Allow forcing call stack capture. 2018-12-13 14:43:37 +01:00
TracyVulkan.hpp Hide internals behind TracyVkCtx typedef. 2019-01-14 12:40:54 +01:00

Tracy Profiler


Tracy is a real-time, nanosecond-resolution frame profiler that can be used for remote or embedded telemetry of your application. It can profile CPU (C, C++, Lua), GPU (OpenGL, Vulkan) and memory. It can also display locks held by threads and their interactions with each other.

Tracy requires compiler support for C++11, Thread Local Storage and a way to work around the static initialization order fiasco. There are no other requirements. The following platforms are confirmed to be working (this is not a complete list):

  • Windows (x86, x64)
  • Linux (x86, x64, ARM, ARM64)
  • Android (ARM, x86)
  • FreeBSD (x64)
  • Cygwin (x64)
  • WSL (x64)
  • OSX (x64)

The following compilers are supported:

  • MSVC
  • gcc
  • clang

Introduction to Tracy Profiler v0.2
New features in Tracy Profiler v0.3
New features in Tracy Profiler v0.4

A quick FAQ.
List of changes.

High-level overview

Tracy is split into a client side and a server side. The client side collects events using a high-efficiency queue and waits for an incoming connection. The server side connects to the client and receives the collected data, which is then reconstructed into a viewable timeline. The transfer is performed over a TCP connection.
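The flow described above can be illustrated with a small, self-contained Python sketch. It is not Tracy's implementation or wire protocol: the line-based text format, the event names and the `run_client`/`run_server` helpers are all assumptions made up for this example. It only shows the shape of the design, in which the profiled side queues events and listens, and the viewer side connects, receives and orders them.

```python
import queue
import socket
import threading

def run_client(events, listener):
    """Client side: wait for a viewer to connect, then drain the queue."""
    conn, _ = listener.accept()
    with conn:
        while True:
            item = events.get()
            if item is None:  # sentinel: no more events to send
                break
            name, timestamp_ns = item
            # Hypothetical text format: "<name> <timestamp>\n"
            conn.sendall(f"{name} {timestamp_ns}\n".encode())

def run_server(port):
    """Server side: connect to the client and rebuild an ordered timeline."""
    timeline = []
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("r") as stream:
            for line in stream:  # ends when the client closes the connection
                name, timestamp_ns = line.split()
                timeline.append((int(timestamp_ns), name))
    return sorted(timeline)

# The profiled application queues events before any viewer is attached.
events = queue.Queue()
for item in [("zone_begin", 120), ("frame_begin", 100), ("zone_end", 135), None]:
    events.put(item)

listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick one
port = listener.getsockname()[1]

client = threading.Thread(target=run_client, args=(events, listener))
client.start()
timeline = run_server(port)
client.join()
listener.close()

for timestamp_ns, name in timeline:
    print(timestamp_ns, name)
```

Note that the connection direction matches the text: the client is the listening side, so the instrumented application can run unattended until a viewer attaches.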

Performance impact

To check how much slowdown is introduced by using Tracy, I have profiled etcpak, which is the fastest ETC texture compression utility there is. I used an 8192×8192 test image as input data and instrumented everything down to the 4×4 pixel block compression function (that's 4 million blocks to compress). It should be noted that Tracy needs to calibrate its internal timers at each run. This introduces a delay of 115 ms (on my machine), which is negligible when doing lengthy profiling runs, but it skews the results of etcpak timing. The following times have this delay subtracted, to focus on the impact of zone collection, which is the thing that really matters here.

| Scenario | Zones | Clean run | Profiling run | Difference |
|---|---|---|---|---|
| Compression of an image to ETC1 format | 4194568 | 0.94 s | 1.003 s | +0.063 s |
| Compression of an image to ETC2 format, with mip-maps | 5592822 | 1.034 s | 1.119 s | +0.085 s |

In both scenarios the per-zone time cost is about 15 ns. This is in line with the measured 8 ns single event collection time (each zone has to report a start and an end event).
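As a quick check of that figure, the per-zone cost can be recomputed from the table by dividing the run-time difference by the number of zones. The sketch below uses only the numbers quoted above; the dictionary layout and the `per_zone_cost_ns` helper are hypothetical names chosen for this example.

```python
# Figures copied from the performance table; nothing here is measured anew.
SCENARIOS = {
    "ETC1": {"zones": 4_194_568, "clean_s": 0.94, "profiled_s": 1.003},
    "ETC2 with mip-maps": {"zones": 5_592_822, "clean_s": 1.034, "profiled_s": 1.119},
}

# Two events per zone (start and end) at the quoted 8 ns per event.
EXPECTED_FROM_EVENTS_NS = 2 * 8

def per_zone_cost_ns(zones, clean_s, profiled_s):
    """Average extra time per zone, in nanoseconds."""
    return (profiled_s - clean_s) / zones * 1e9

costs = {name: per_zone_cost_ns(**row) for name, row in SCENARIOS.items()}

for name, cost in costs.items():
    print(f"{name}: {cost:.1f} ns per zone "
          f"(two events at 8 ns would be {EXPECTED_FROM_EVENTS_NS} ns)")
```

Both scenarios land close to 15 ns, slightly under the 16 ns that two 8 ns events would suggest.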

Usage instructions

The user manual for Tracy is available at the following address. It provides information about the integration process, required code markup and so on.

Features

Histogram of function execution times

Comparison of two profiling runs

Marking locks

Plotting data

Message log