Tracy Profiler
Tracy is a real-time, nanosecond-resolution frame profiler that can be used for remote or embedded telemetry of your application. It can profile CPU (C++, Lua), GPU (OpenGL, Vulkan) and memory. It can also display locks held by threads and their interactions with each other.
Tracy requires compiler support for C++11, Thread Local Storage and a way to work around the static initialization order fiasco. There are no other requirements. The following platforms are confirmed to be working (this is not a complete list):
- Windows (x86, x64)
- Linux (x86, x64, ARM, ARM64)
- Android (ARM, x86)
- FreeBSD (x64)
- Cygwin (x64)
- WSL (x64)
- OSX (x64)
The following compilers are supported:
- MSVC
- gcc
- clang
Introduction to Tracy Profiler v0.2
New features in Tracy Profiler v0.3
New features in Tracy Profiler v0.4
High-level overview
Tracy is split into a client side and a server side. The client collects events using a high-efficiency queue and waits for an incoming connection. The server connects to the client and receives the collected data, which is then reconstructed into a viewable timeline. The transfer is performed over a TCP connection.
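The split described above can be illustrated with a small, self-contained sketch. It is not Tracy's implementation: `Event`, `EventQueue`, `ScopedZone`, `Zone` and `reconstruct` are hypothetical names, a mutex-protected vector stands in for the high-efficiency queue, and the TCP transfer is replaced by a plain `drain()` call. It only shows the general idea of collecting begin/end events on the client and rebuilding zones on the server.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical event record; Tracy's real queue items differ.
enum class EventType { ZoneBegin, ZoneEnd };

struct Event {
    EventType type;
    const char* name;      // zone name (only meaningful for ZoneBegin)
    std::int64_t time_ns;  // timestamp in nanoseconds
};

// Client side: a mutex-protected buffer standing in for the real queue.
class EventQueue {
public:
    void push(const Event& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        events_.push_back(e);
    }
    // Hands all pending events to the caller, emulating the transfer to the server.
    std::vector<Event> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<Event> out;
        out.swap(events_);
        return out;
    }
private:
    std::mutex mutex_;
    std::vector<Event> events_;
};

inline std::int64_t now_ns() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// RAII zone: reports a begin event on construction and an end event on destruction.
class ScopedZone {
public:
    ScopedZone(EventQueue& q, const char* name) : queue_(q) {
        queue_.push({EventType::ZoneBegin, name, now_ns()});
    }
    ~ScopedZone() { queue_.push({EventType::ZoneEnd, nullptr, now_ns()}); }
    ScopedZone(const ScopedZone&) = delete;
    ScopedZone& operator=(const ScopedZone&) = delete;
private:
    EventQueue& queue_;
};

// Server side: one reconstructed timeline entry.
struct Zone {
    std::string name;
    std::int64_t start_ns;
    std::int64_t end_ns;
    int depth;  // nesting level, 0 for top-level zones
};

// Rebuilds zones from the event stream by pairing each end event
// with the most recently opened zone.
std::vector<Zone> reconstruct(const std::vector<Event>& events) {
    std::vector<Zone> zones;
    std::vector<std::size_t> open;  // indices of zones still awaiting their end event
    for (const Event& e : events) {
        if (e.type == EventType::ZoneBegin) {
            zones.push_back({e.name, e.time_ns, e.time_ns, static_cast<int>(open.size())});
            open.push_back(zones.size() - 1);
        } else if (!open.empty()) {
            zones[open.back()].end_ns = e.time_ns;
            open.pop_back();
        }
    }
    return zones;
}
```

In this sketch, placing a `ScopedZone` at the top of a function records that function as a zone, and nested zones come out of `reconstruct` with increasing `depth`. A real profiler would replace the mutex with a cheaper queue and stream the events over a socket instead of returning them directly.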
Performance impact
To check how much slowdown Tracy introduces, I profiled etcpak, which is the fastest ETC texture compression utility there is. I used an 8192×8192 test image as input data and instrumented everything down to the 4×4 pixel block compression function (that's 4 million blocks to compress). It should be noted that Tracy needs to calibrate its internal timers at each run. This introduces a delay of 115 ms (on my machine), which is negligible in lengthy profiling runs, but it skews the etcpak timing results. The following times have this delay subtracted, to focus on the impact of zone collection, which is what really matters here.
Scenario | Zones | Clean run | Profiling run | Difference |
---|---|---|---|---|
Compression of an image to ETC1 format | 4194568 | 0.94 s | 1.003 s | +0.063 s |
Compression of an image to ETC2 format, with mip-maps | 5592822 | 1.034 s | 1.119 s | +0.085 s |
In both scenarios the per-zone time cost is ~15 ns. This is in line with the measured 8 ns single event collection time (each zone has to report a start and an end event).
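The per-zone figure follows directly from the table: divide the time difference by the number of zones. A minimal sketch of that arithmetic, where `per_zone_cost_ns` is a hypothetical helper written for this illustration and not part of Tracy:

```cpp
#include <cstdint>

// Average added cost per zone, in nanoseconds:
// (profiling run time - clean run time) / number of zones.
double per_zone_cost_ns(double clean_s, double profiled_s, std::uint64_t zones) {
    return (profiled_s - clean_s) * 1e9 / static_cast<double>(zones);
}
```

With the table values, `per_zone_cost_ns(0.94, 1.003, 4194568)` gives about 15.0 ns for ETC1 and `per_zone_cost_ns(1.034, 1.119, 5592822)` gives about 15.2 ns for ETC2. Two events at 8 ns each would be 16 ns, so the measured cost is consistent with the single event collection time.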
Usage instructions
The user manual for Tracy is available at the following address. It provides information about the integration process, required code markup and so on.