6 Commits

Author SHA1 Message Date
Walter Erquinigo
9f45f23d86 [trace][intelpt] Support system-wide tracing [21] - Support long numbers in JSON
llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex for and
the latter in decimal string form. The schema was updated mentioning
that.

Besides that, I fixed some remaining issues and now all test pass. Before I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.

Differential Revision: https://reviews.llvm.org/D127819
2022-06-16 11:42:22 -07:00
Walter Erquinigo
6a5355e8a1 [trace][intelpt] Support system-wide tracing [20] - Rename some fields in the schema
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema

- traceBuffer -> iptTrace
- core -> cpu

Differential Revision: https://reviews.llvm.org/D127817
2022-06-16 11:42:22 -07:00
Walter Erquinigo
03cc58ff2a [trace][intelpt] Support system-wide tracing [17] - Some improvements
This improves several things and addresses comments up to the diff [11] in this stack.

- Simplify many functions to receive less parameters that they can identify easily
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identifaction from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute
- Fix some issues in the tests. Now they all pass.
- return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Synchronize better the state of the context switch and the intel pt
perf events.
- Use a booblean instead of an enum for the PerfEvent state.

Differential Revision: https://reviews.llvm.org/D127456
2022-06-16 11:23:02 -07:00
Walter Erquinigo
ff15efc1a7 [trace][intelpt] Support system-wide tracing [16] - Create threads automatically from context switch data in the post-mortem case
For some context, The context switch data contains information of which threads were
executed by each traced process, therefore it's not necessary to specify
them in the trace file.

So this diffs adds support for that automatic feature. Eventually we
could include it to live processes as well.

Differential Revision: https://reviews.llvm.org/D127001
2022-06-16 11:23:02 -07:00
Walter Erquinigo
ef9970759b [trace][intelpt] Support system-wide tracing [15] - Make triple optional
The process triple should only be needed when LLDB can't identify the correct
triple on its own. Examples could be universal mach-o binaries. But in any case,
at least for most of ELF files, LLDB should be able to do the job without having
the user specify the triple manually.

Differential Revision: https://reviews.llvm.org/D126990
2022-06-16 11:23:01 -07:00
Walter Erquinigo
a19fcc2bec [trace][intelpt] Support system-wide tracing [14] - Decode per cpu
This is the final functional patch to support intel pt decoding per cpu.
It works by doing the following:

- First, all context switches are split by tid and sorted in order. This produces a list of continuous executes per thread per core.
- Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
- With this, we have, per thread, a sorted list of continuous executions each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
- Then, each thread can be decoded by traversing their continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous exexecution doesn't have intel pt data, and thus has a gap in it. We can later to more sofisticated comparisons to identify if within a continuous execution there are gaps.

I'm adding a test as well.

Differential Revision: https://reviews.llvm.org/D126394
2022-06-16 11:23:01 -07:00