On gfx12+, the unified `s_barrier` is lowered to a split
`s_barrier_signal`/`s_barrier_wait` pair. By default, the dependency edge
between the signal and the wait has zero latency, so the scheduler tends
to emit them back to back. This misses the opportunity to hide barrier
latency.

This patch adds synthetic latency to the signal-wait barrier edge to
encourage latency hiding: the scheduler can then place independent
instructions in the gap between the split barrier signal and wait.
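
The effect can be illustrated with a toy model. The sketch below is not
the code in this patch and does not use any LLVM API: `Instr`,
`makeBlock` and `schedule` are hypothetical names, the `v_*` entries
stand in for arbitrary independent instructions, and the scheduler is a
simplified single-issue greedy list scheduler that picks the first ready
instruction in program order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only (hypothetical, not LLVM code): an instruction has at most
// one predecessor, and the edge to that predecessor carries a latency.
struct Instr {
    std::string name;
    int pred;     // index of the instruction this one depends on, or -1
    int latency;  // cycles that must pass after pred issues
};

// Builds signal, wait, and three instructions independent of the barrier.
// barrierLatency is the latency placed on the signal -> wait edge.
std::vector<Instr> makeBlock(int barrierLatency) {
    return {
        {"s_barrier_signal", -1, 0},
        {"s_barrier_wait", 0, barrierLatency},
        {"v_add", -1, 0},
        {"v_mul", -1, 0},
        {"v_sub", -1, 0},
    };
}

// Single-issue greedy list scheduler: each cycle, issue the first
// not-yet-issued instruction (in program order) whose predecessor has
// issued at least `latency` cycles earlier.
std::vector<std::string> schedule(const std::vector<Instr>& instrs) {
    const int n = static_cast<int>(instrs.size());
    std::vector<int> issuedAt(n, -1);
    std::vector<std::string> order;
    for (int cycle = 0; static_cast<int>(order.size()) < n; ++cycle) {
        for (int i = 0; i < n; ++i) {
            if (issuedAt[i] != -1) continue;  // already issued
            const int p = instrs[i].pred;
            assert(p < n && "predecessor index out of range");
            const bool ready =
                p == -1 ||
                (issuedAt[p] != -1 && issuedAt[p] + instrs[i].latency <= cycle);
            if (ready) {
                issuedAt[i] = cycle;
                order.push_back(instrs[i].name);
                break;  // one instruction per cycle
            }
        }
    }
    return order;
}
```

In this model, `schedule(makeBlock(0))` keeps the wait directly after the
signal, while `schedule(makeBlock(3))` issues `v_add` and `v_mul` between
them. The real scheduler weighs many more factors, so the actual
placement may differ.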

The latency is tunable via `-amdgpu-barrier-signal-wait-latency`.

Fixes: SWDEV-567090