
This fixes the llvm-support build that generates the profile data. It also wraps the whole `cmake --build` command with perf, rather than each individual clang invocation, which limits the number of profile files generated and reduces the time spent running perf2bolt.
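
As a rough illustration of wrapping the whole build, the sketch below prints a single `perf record` command around `cmake --build`. The helper name `perf_wrapped_build`, the `BUILD_DIR` and `PERF_DATA` variables, and the event flags (`-e cycles:u -j any,u`, the form commonly used for BOLT profiling on LBR-capable hardware) are assumptions for illustration, not values taken from this change.

```shell
#!/bin/sh
# Sketch only: the helper name, BUILD_DIR, PERF_DATA and the perf event
# flags are illustrative assumptions, not taken from this change.

# Print the command that records one perf profile around an entire build,
# instead of one profile per compiler invocation.
perf_wrapped_build() {
    build_dir=$1
    perf_data=$2
    printf '%s\n' "perf record -e cycles:u -j any,u -o $perf_data -- cmake --build $build_dir"
}

# Show the command for inspection; nothing is executed here.
perf_wrapped_build "${BUILD_DIR:-build}" "${PERF_DATA:-perf.data}"
```

The helper only prints the command, so it can be inspected without perf or a configured build tree. With one recording per build, perf2bolt would run over one `perf.data` file instead of one file per clang process.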