The createSIMachineScheduler & createPostMachineScheduler
target hooks are currently placed in the PassConfig interface.
Moving it out to TargetMachine so that both legacy and
the new pass manager can effectively use them.
Fixed a bug in stall cycle calculation.
When a register defined by an instruction in the current iteration is
used by an instruction in the next iteration, we have modified the
number of stall cycle that need to be inserted.
Added some logic to check loop-carried phis in the window scheduler. It now includes the scenario where the preceding phi uses the virtual register defined by the succeeding phi.
This commit implements the Window Scheduler as described in the RFC:
https://discourse.llvm.org/t/rfc-window-scheduling-algorithm-for-machinepipeliner-in-llvm/74718
This Window Scheduler implements the window algorithm designed by
Steven Muchnick in the book "Advanced Compiler Design And
Implementation",
with some improvements:
1. Copy 3 times of the loop kernel and construct the corresponding DAG
to identify dependencies between MIs;
2. Use heuristic algorithm to obtain a set of window offsets.
The window algorithm is equivalent to modulo scheduling algorithm with a
stage of 2. It is mainly applied in targets where hardware resource
conflicts are severe, and the SMS algorithm often fails in such cases.
On our own DSA, this window algorithm typically can achieve a
performance
improvement of over 10%.
Co-authored-by: Kai Yan <aklkaiyan@tencent.com>
Co-authored-by: Ran Xiao <lennyxiao@tencent.com>
---------
Co-authored-by: Kai Yan <aklkaiyan@tencent.com>
Co-authored-by: Ran Xiao <lennyxiao@tencent.com>