The existing way of creating the predicate in the guard blocks uses
a boolean value per outgoing block. This increases the number of live
booleans as the number of outgoing blocks increases. The new way added
in this change is to store one integer to represent the outgoing block
we want to branch to, then at each guard block, an integer equality
check is performed to decide which a specific outgoing block is taken.
Using an integer reduces the number of live values and decreases
register pressure especially in cases where there are a large number
of outgoing blocks. The integer based approach is used when the
number of outgoing blocks crosses a threshold, which is currently set
to 32.
Patch by Ruiling Song.
Differential review: https://reviews.llvm.org/D127831
In the NPM, a pass cannot depend on another non-analysis pass. So pin
the test that tests that -lowerswitch is run automatically to legacy PM.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D89051
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
This restores commit 2ada8e2525dd2653f30c8696a27162a3b1647d66.
Originally reverted with commit 44e09b59b869a91bf47d76e8bc569d9ee91ad145.
This reverts commit 2ada8e2525dd2653f30c8696a27162a3b1647d66.
Buildbots produced compilation errors which I was not able to quickly
reproduce locally. Need more time to investigate.
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198