Everything can be confined to a single thread that does job dispatch,
and then waits for the jobs to finish. TaskDispatch has always executed
outstanding work during this wait, so no workers are needed.
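
A minimal sketch of the idea, assuming a dispatcher whose wait call drains
the job queue on the calling thread. InlineDispatch, Queue(), Sync() and
Pending() are hypothetical names used for illustration only; this is not
the actual TaskDispatch implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a job dispatcher, not the real TaskDispatch.
class InlineDispatch
{
public:
    // Queue a job. Nothing runs until Sync() is called.
    void Queue( std::function<void()> job ) { m_jobs.push_back( std::move( job ) ); }

    // "Wait" for completion by executing outstanding jobs on the calling
    // thread, so no worker threads are required.
    void Sync()
    {
        while( !m_jobs.empty() )
        {
            auto job = std::move( m_jobs.front() );
            m_jobs.pop_front();
            job();   // may queue further jobs, which this loop also runs
        }
        assert( m_jobs.empty() );
    }

    std::size_t Pending() const { return m_jobs.size(); }

private:
    std::deque<std::function<void()>> m_jobs;
};
```

Jobs queued from inside a running job are handled by the same loop, so a
single call to Sync() finishes all the work.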
For some unknown reason, local threads may be attributed to an external
process (at least when profiling on Windows). This causes some problems;
for example, the CPU usage graph may show that the CPU is pegged by some
other program, when in reality it is the profiled program that uses the
CPU time.
Work around this by first checking whether the thread id is known to be
local by the profiler, i.e. whether any user-generated events originated
from it.
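
The check can be sketched roughly as follows, assuming the profiler keeps
a set of thread ids that produced user-generated events next to the raw
TID -> PID mapping. ThreadOwnership and its members are hypothetical names,
not taken from the profiler's code.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical sketch; the names are illustrative only.
struct ThreadOwnership
{
    // Threads that produced user-generated events in the profiled program.
    std::unordered_set<uint64_t> localThreads;
    // Raw TID -> PID mapping as reported by the system (may be wrong).
    std::unordered_map<uint64_t, uint64_t> tidToPid;
    // PID of the profiled program.
    uint64_t ownPid = 0;

    // Local knowledge takes precedence over the raw mapping.
    bool IsLocal( uint64_t tid ) const
    {
        if( localThreads.count( tid ) != 0 ) return true;
        const auto it = tidToPid.find( tid );
        return it != tidToPid.end() && it->second == ownPid;
    }
};
```

The raw mapping itself is left untouched; only the attribution decision
consults the set of known local threads first.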
This still leaves other things, such as the CPU data list, being wrong,
but the CPU data is meant to show raw TID -> PID mapping. If the source
data is wrong, there's not much to fix here.
The IsThreadStringRetrieved() interface suggested that it could be used
for checking the state of any thread, as it had a uint64_t id parameter.
The implementation ignored this parameter and checked the status of the
failure thread only. This was never an issue, because the code using
this function was only checking the failure thread state.
Fixed by renaming the function to explicitly state what it does and
removing the thread id parameter.
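
A minimal sketch of the narrowed interface, under the assumption that
thread names are kept in a map keyed by thread id. ThreadNames and all of
its members, including the function name, are hypothetical and chosen for
illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch; class and member names are illustrative only.
class ThreadNames
{
public:
    void SetFailureThread( uint64_t tid ) { m_failureThread = tid; }
    void AddName( uint64_t tid, std::string name ) { m_names[tid] = std::move( name ); }

    // No thread id parameter: the name states the only thing that is
    // checked, so callers cannot assume it works for arbitrary threads.
    bool IsFailureThreadNameRetrieved() const
    {
        return m_names.find( m_failureThread ) != m_names.end();
    }

private:
    uint64_t m_failureThread = 0;
    std::unordered_map<uint64_t, std::string> m_names;
};
```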
Without this correction the code would combine all lock regions according
to the minimum visibility range rules, and assign the combined area the
highest lock state found among all of its items. This could produce quite
long combined lock regions, which made it look as if lock contention was
present throughout.
Combined lock regions should instead be split to show exactly where the
lock contention is present. Combining is still performed here, but only
within the minimum visibility range.
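
One possible reading of this rule, as a sketch only: regions are sorted
by start time, and a region absorbs the following ones only while they
end within minVis of its own start. LockRegion, CombineRegions() and the
integer state encoding (higher means more severe) are assumptions made
for illustration, not the profiler's actual types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical region type; a higher state value is more severe.
struct LockRegion { int64_t start; int64_t end; int state; };

static bool StartsBefore( const LockRegion& a, const LockRegion& b )
{
    return a.start < b.start;
}

// Combine regions, but never beyond minVis from the first item's start.
std::vector<LockRegion> CombineRegions( const std::vector<LockRegion>& in, int64_t minVis )
{
    assert( std::is_sorted( in.begin(), in.end(), StartsBefore ) );
    std::vector<LockRegion> out;
    std::size_t i = 0;
    while( i < in.size() )
    {
        LockRegion r = in[i++];
        const int64_t limit = r.start + minVis;
        // Absorb following regions only while they end inside the window.
        while( i < in.size() && in[i].end <= limit )
        {
            r.end = std::max( r.end, in[i].end );
            r.state = std::max( r.state, in[i].state );
            i++;
        }
        out.push_back( r );
    }
    return out;
}
```

Because the window is anchored at the first item of each combined region,
a chain of short regions cannot grow into one long region that carries
the highest state of the whole chain.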
The corrected behavior was present previously, but was mistakenly omitted
during a code refactor.