This change improves the matching algorithm by using a diff algorithm.
The current matching algorithm only processes callsites grouped by
function name and doesn't consider the ordering between functions with
different names, so it sometimes fails to handle ambiguous anchors. For
example (`foo:1` denotes a callsite, written as
[callee_name:callsite_location]):
```
IR : foo:1 bar:2 foo:4 bar:5
Profile : bar:3 foo:5 bar:6
```
With the current algorithm, `foo:1` is matched to `foo:5`, the 2nd anchor
in the profile. Using a diff algorithm (finding the longest common
subsequence) helps with this case.
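To make the example concrete, here is a minimal sketch of order-aware matching using a textbook dynamic-programming LCS over callee names. It is not the code in this change (which uses a Myers-style search); the `Anchor` type and `matchByLCS` function are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical anchor type: callee name plus callsite location.
struct Anchor {
  std::string Callee;
  unsigned Loc;
};

// Returns (IR location, profile location) pairs for one longest common
// subsequence of callee names, using the textbook O(N*M) DP table.
std::vector<std::pair<unsigned, unsigned>>
matchByLCS(const std::vector<Anchor> &IR, const std::vector<Anchor> &Prof) {
  const std::size_t N = IR.size(), M = Prof.size();
  // Len[I][J] = LCS length of the suffixes IR[I..] and Prof[J..].
  std::vector<std::vector<unsigned>> Len(N + 1,
                                         std::vector<unsigned>(M + 1, 0));
  for (std::size_t I = N; I-- > 0;)
    for (std::size_t J = M; J-- > 0;)
      Len[I][J] = IR[I].Callee == Prof[J].Callee
                      ? Len[I + 1][J + 1] + 1
                      : std::max(Len[I + 1][J], Len[I][J + 1]);

  // Walk the table from the front to recover the matched pairs.
  std::vector<std::pair<unsigned, unsigned>> Matches;
  std::size_t I = 0, J = 0;
  while (I < N && J < M) {
    if (IR[I].Callee == Prof[J].Callee) {
      Matches.emplace_back(IR[I].Loc, Prof[J].Loc);
      ++I;
      ++J;
    } else if (Len[I + 1][J] >= Len[I][J + 1]) {
      ++I; // skip an IR anchor that has no counterpart
    } else {
      ++J; // skip a profile anchor that has no counterpart
    }
  }
  // Sanity check: the recovered pairs form a subsequence of LCS length.
  assert(Matches.size() == Len[0][0]);
  return Matches;
}
```

On the example above, this pairs `bar:2` with `bar:3`, `foo:4` with `foo:5`, and `bar:5` with `bar:6`, leaving `foo:1` unmatched.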
One well-known diff algorithm is the Myers diff algorithm (paper: "An
O(ND) Difference Algorithm and Its Variations", Eugene W. Myers). Its
variations have been implemented in many widely used tools, such as
GNU diff and git diff. It provides an efficient way to find the longest
common subsequence or the shortest edit script through graph search.
There are several variations and refinements of the algorithm, but in
our case the number of function callsites is usually very small, so this
change implements the basic greedy version, which should be good
enough.
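For readers unfamiliar with the greedy variant, the following is a rough sketch of its forward search, computing only the length D of the shortest edit script between two callee-name sequences. It is an illustration under my own naming (`shortestEditDistance` is hypothetical), not the implementation in this change, and it omits the backtracking needed to recover the actual matches.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the length D of the shortest edit script (insertions plus
// deletions) turning A into B, using the basic greedy forward search.
int shortestEditDistance(const std::vector<std::string> &A,
                         const std::vector<std::string> &B) {
  const int N = static_cast<int>(A.size());
  const int M = static_cast<int>(B.size());
  const int Max = N + M;
  // V[K + Offset] holds the furthest x reached on diagonal K = x - y.
  const int Offset = Max + 1;
  std::vector<int> V(2 * Max + 3, 0);
  for (int D = 0; D <= Max; ++D) {
    for (int K = -D; K <= D; K += 2) {
      int X;
      if (K == -D || (K != D && V[K - 1 + Offset] < V[K + 1 + Offset]))
        X = V[K + 1 + Offset]; // step down: consume an element of B
      else
        X = V[K - 1 + Offset] + 1; // step right: consume an element of A
      int Y = X - K;
      assert(Y >= 0);
      // Follow the diagonal ("snake") while the elements match.
      while (X < N && Y < M && A[X] == B[Y]) {
        ++X;
        ++Y;
      }
      V[K + Offset] = X;
      if (X >= N && Y >= M)
        return D; // reached the end of both sequences
    }
  }
  return Max; // not reached: D = N + M always suffices
}
```

The LCS length follows as (N + M - D) / 2. For the IR and profile sequences in the example above, D is 1 (only `foo:1` has to be dropped), so the LCS has length 3.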
We observed better matching and performance improvements on our
internal services.
Currently the code uses `FunctionSamples::getCallSiteIdentifier`, which
will sometimes incorrectly guess that FSAFDO discriminators are probe
based and convert them incorrectly.
This change doesn't affect builds that don't use FSAFDO; it only fixes
sample profile matching with FS discriminators.
The test for this is manually updated to use discriminator value 15,
which is a perfectly valid base discriminator in the FS world, but
satisfies `isPseudoProbeDiscriminator`, so
`getBaseDiscriminatorFromDiscriminator` will incorrectly extract the
probe index from it.
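The arithmetic behind the value 15 can be sketched as follows. The bit layout is an assumption made for illustration (low three bits all set mark a pseudo-probe discriminator, the probe index sits directly above them, and the FS base discriminator occupies the low 8 bits); the helper names are hypothetical stand-ins, not the LLVM functions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Assumed encoding: a discriminator whose low three bits are all set is
// treated as a pseudo-probe discriminator.
constexpr bool looksLikePseudoProbe(uint32_t D) { return (D & 0x7) == 0x7; }

// Assumed layout: the probe index sits directly above the three tag bits.
constexpr uint32_t probeIndexOf(uint32_t D) { return (D >> 3) & 0xFFFF; }

// Assumed FS layout: the base discriminator is the low 8 bits.
constexpr uint32_t fsBaseOf(uint32_t D) { return D & 0xFF; }

// Extraction that guesses "probe-based" from the bit pattern alone. This
// models the problematic path described above.
constexpr uint32_t guessedBase(uint32_t D) {
  return looksLikePseudoProbe(D) ? probeIndexOf(D) : fsBaseOf(D);
}
```

Under these assumptions, 15 is `0b1111`: its low three bits are all set, so the guessing path returns probe index 1, whereas the intended FS base discriminator is 15.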
Note: this change only affects how the base discriminators will be
extracted when doing stale profile matching in the IR-level sample
profile loader. It doesn't add stale profile matching to the MIR-level
FS profile loader pass.
Currently `-salvage-stale-profile` is a no-op if the profile is not
probe-based. We observed that it can help for regular, non-probe-based
profiles too: some of our internal benchmarks show a 0.2-0.3% QPS
improvement.
There seems to be no good reason to limit this flag to only work for
probe-based profiles.