Without this correction the code would combine all lock regions according
to the minimum visibility range rules and assign the combined area the
highest lock state found among all its items. This could produce quite long
combined lock regions in which lock contention only appeared to be present.
Combined lock regions should instead be split to show exactly where the
lock contention is present. Combining is still performed here, but only
within the minimum visibility range.
This behavior was also present previously, but was mistakenly dropped
during a code refactor.
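
As an illustration, below is a minimal sketch of the corrected combining
rule, written against a hypothetical LockEvent type rather than the actual
Tracy data structures. Events are merged only while they stay within the
minimum visibility range, and each combined region takes the highest lock
state found within that range only.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Assumed severity ordering: later enumerators are "higher" states.
    enum class LockState { Nothing, HasLock, HasBlockingLock, WaitLock };

    struct LockEvent
    {
        int64_t time;       // timestamp in nanoseconds
        LockState state;
    };

    struct CombinedRegion
    {
        int64_t begin, end;
        LockState state;    // highest state within this region only
    };

    std::vector<CombinedRegion> CombineLockRegions(
        const std::vector<LockEvent>& ev, int64_t MinVisNs )
    {
        std::vector<CombinedRegion> out;
        size_t i = 0;
        while( i < ev.size() )
        {
            CombinedRegion r { ev[i].time, ev[i].time, ev[i].state };
            // Extend the region only while the next event is closer than
            // the minimum visibility range.
            while( i + 1 < ev.size() && ev[i+1].time - ev[i].time < MinVisNs )
            {
                ++i;
                r.end = ev[i].time;
                r.state = std::max( r.state, ev[i].state );
            }
            out.push_back( r );
            ++i;
        }
        return out;
    }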
Just display the known running regions, keeping the unended ones collapsed
indefinitely. As a result, the CPU core usage graph may display wrong data
at the end of the capture. Note that this behavior was also present in
previous releases.
We need to know whether samples, context switches and messages are present
in order to correctly calculate the thread height. However, if the thread
is not visible, there is no need to provide a list of items to draw.
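
A rough sketch of this split, with made-up names (the real code uses
Tracy's own preprocessing and draw structures): the presence flags are
always filled in so the thread height can be computed, while the draw list
is only populated when the thread is visible.

    #include <vector>

    struct TimelineDrawItem { /* zone, sample or message to draw */ };

    struct ThreadPreprocessData
    {
        // Always computed; needed to calculate the thread height.
        bool hasSamples = false;
        bool hasCtxSwitches = false;
        bool hasMessages = false;
        // Only filled in when the thread is actually visible.
        std::vector<TimelineDrawItem> drawList;
    };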
The prev == it condition could only fire on the first iteration of the
loop; on all subsequent iterations prev (= next-1) can never be "it" again.
Removing this condition changes nothing, as the following lines that check
the time distance between next and prev provide the same exit condition
(i.e. next-1 equals "it" only if lower_bound does not find anything, in
which case next is farther away than MinVisNs).
The folding process starts at the "next" item. The nextTime variable
represents a time point before which everything should be folded, because
all items in that range are smaller than the MinVis range.
The lower_bound search finds a new "next" item, which lies beyond nextTime.
However, nextTime originates in the previous "next" item, which may not be
the last item in the folding range. If the distance between the new "next"
item and the item directly before it is smaller than MinVis, then the new
"next" item is also folded and the folding loop must continue to run.
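
A minimal sketch of the folding loop described above, using a hypothetical
Item type instead of Tracy's zone vectors. Items whose end times are closer
together than MinVisNs are merged into a single collapsed region; note the
comparison of the new "next" against the item directly before it.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Item { int64_t start, end; };

    // Given the first item of a would-be collapsed region, returns the end
    // time of the folded region and advances 'next' past its last item.
    int64_t FoldRegion( const std::vector<Item>& items,
                        std::vector<Item>::const_iterator it,
                        std::vector<Item>::const_iterator& next,
                        int64_t MinVisNs )
    {
        int64_t end = it->end;
        next = it + 1;
        for(;;)
        {
            // Everything ending before nextTime is too small to draw on
            // its own, so it belongs to the folded region.
            const int64_t nextTime = end + MinVisNs;
            // Find the first item ending at or after nextTime.
            next = std::lower_bound( next, items.end(), nextTime,
                []( const Item& l, int64_t r ) { return l.end < r; } );
            if( next == items.end() )
            {
                end = items.back().end;     // fold everything that remains
                break;
            }
            // nextTime originated in a previous item that may not be the
            // last one in the folded range, so compare the new 'next' with
            // the item directly before it. If they are still closer than
            // MinVisNs, the new 'next' is folded too and the loop continues.
            const auto prev = next - 1;
            if( next->end - prev->end >= MinVisNs ) break;
            end = next->end;
        }
        return end;
    }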
There are various changes involved in making this work (see the sketch
further below):
1. Zone size (zsz) is no longer clamped to the timeline viewport area.
This clamping has to be removed to prevent otherwise uncollapsed zones
from apparently becoming small near the viewport borders. Such a small
zone would then be collapsed, resulting in unwanted popping.
Interestingly, only the CPU zones were clamped before. GPU zones were
not.
2. Iteration over visible zones has to start before the visible timeline
viewport area. Without this, some zones that would otherwise be included
in a collapsed area (started by a previous zone) may become fully
visible. This causes child zones to be drawn and produces unwanted
popping. (At this point the threshold for continuing a collapsed area
is greater than the threshold for starting one.)
3. Since the iteration now starts before the visible timeline area, it may
happen that everything found lies in a small slice of the timeline that
is outside the screen. To fix this, the end time of the last found item
is checked against the viewport start time.
It is always valid to access *(zitend-1), as in each case this is done
after the empty-range check (it == zitend).
A similar but simpler fix was also applied to per-thread call stack
samples.
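
A hedged sketch of the checks behind points 1 and 3 above, with made-up
names (the actual drawing code is more involved): the zone size is computed
from the raw begin/end times without clamping to the viewport, and a folded
run that ends before the visible range is discarded instead of being drawn.

    #include <cstdint>

    struct ZoneTimes { int64_t begin, end; };

    // pxns = pixels per nanosecond, MinVisPx = collapse threshold in pixels.
    bool ShouldCollapse( const ZoneTimes& z, double pxns, double MinVisPx )
    {
        // Point 1: no clamping of zsz to the viewport, so a zone does not
        // appear to shrink (and pop into a collapsed state) near the
        // viewport borders.
        const double zsz = ( z.end - z.begin ) * pxns;
        return zsz < MinVisPx;
    }

    // Point 3: iteration may begin before the visible area (point 2), so a
    // folded run found there can lie entirely off screen; drop it then.
    bool FoldedRunIsVisible( int64_t foldedEndNs, int64_t viewStartNs )
    {
        return foldedEndNs >= viewStartNs;
    }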
The reasoning is that you want to use the color to see where a zone of a
particular type is placed. When collapsed zones go back to displaying the
thread color, you may mistake such a region of collapsed zones for
something it isn't.
Previously, on-demand mode was determined by the frame offset parameter
being greater than zero. However, if the application is not pumping frames
with the FrameMark macro, the frame index never increases and the frame
offset parameter stays at zero. In this scenario it is not possible to
distinguish on-demand traces from normal ones.
Fix this by explicitly saving the on-demand flag in the trace file and by
employing the previous logic to set the flag when importing older traces.
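
A hedged sketch of the versioned load path, using made-up helper names (the
real trace format relies on Tracy's own serialization and version
constants): new traces store the on-demand flag explicitly, while older
traces fall back to the previous heuristic.

    #include <cstdint>

    struct TraceHeader
    {
        uint32_t version;       // trace file format version
        uint64_t frameOffset;
        bool onDemand;          // stored explicitly in newer versions
    };

    // onDemandFlagVersion stands in for the first format version that
    // stores the flag explicitly.
    bool IsOnDemandTrace( const TraceHeader& hdr, uint32_t onDemandFlagVersion )
    {
        if( hdr.version >= onDemandFlagVersion ) return hdr.onDemand;
        // Older traces: use the previous heuristic, which misses traces of
        // applications that never issued FrameMark (frameOffset stays 0).
        return hdr.frameOffset > 0;
    }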