6 Commits

Author SHA1 Message Date
Usman Nadeem
c9325f8a2e
[DFAJumpThreading] Add an early exit heuristic for unpredictable values (#85015)
Right now the algorithm does not exit on unpredictable values. It
waits until all the paths have been enumerated to see if any of
those paths have that value. Waiting this late leads to a lot of
wasteful computation and higher compile time.

In this patch I have added a heuristic that checks whether the value
comes from the same inner loops as the switch. If so, it is likely
that the value will also be seen on a threadable path, so the code in
`getStateDefMap()` returns an empty map.
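
A toy sketch of the early-exit idea, not the LLVM implementation: `ToyLoop`,
`ToyDef`, and `collectStateDefs` are hypothetical names, "same inner loops"
is simplified to "same innermost loop", and skipping unpredictable values
from other loops is an assumption of this sketch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM classes.
struct ToyLoop {};

struct ToyDef {
  std::string name;
  bool predictable;     // true if the definition is a known constant
  const ToyLoop *loop;  // innermost loop containing the definition
};

using ToyStateDefMap = std::map<std::string, const ToyDef *>;

// Collect the definitions that feed the switch condition. Bail out with an
// empty map as soon as an unpredictable definition is found in the same loop
// as the switch, instead of discovering it later during path enumeration.
ToyStateDefMap collectStateDefs(const std::vector<ToyDef> &defs,
                                const ToyLoop *switchLoop) {
  ToyStateDefMap result;
  for (const ToyDef &def : defs) {
    if (!def.predictable) {
      if (def.loop == switchLoop)
        return {};  // early exit: likely to appear on a threadable path
      continue;     // assumption: values from other loops are just skipped
    }
    result[def.name] = &def;
  }
  return result;
}
```

An empty result would tell the caller to give up on this switch before any
paths are enumerated.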

I tested this on the llvm test suite and the only change in the
number of threaded switches was in 7zip (before 23, after 18).
In all of those cases the current algorithm was partially threading
the loop because it was hitting a limit on the number of paths to
be explored. On increasing this limit even the current algorithm
finds paths where the unpredictable value is seen.

Compile time (with the pass enabled by default and this patch applied):

https://llvm-compile-time-tracker.com/compare.php?from=8c5e9cf737138aba22a4a8f64ef2c5efc80dd7f9&to=42c75d888058b35c6d15901b34e36251d8f766b9&stat=instructions:u
2024-03-16 11:24:42 -07:00
XChy
c880fdc0f0
[DFAJumpThreading] Remove incoming StartBlock from all phis when unfolding select (#71082)
Fixes #65222.
When unfolding select into diamond-like control flow, we need to remove
the StartBlock from all phis in EndBlock.
2023-11-04 03:32:20 +08:00
XChy
7fa41d8a8f
[DFAJumpThreading] Only unfold select coming from directly where it is defined (#70966)
Fixes #64860.
When a select instruction reaches a PHINode as an incoming value, the
phi's incoming block may not be the block that defines the select:
control can pass through other basic blocks on the way. In this case,
we cannot unfold the select into the phi's BB.
2023-11-02 21:25:54 +08:00
Roman Lebedev
641a684fa0
[NFC] Port all DFAJumpThreading tests to -passes= syntax
2022-12-08 02:38:41 +03:00
Nuno Lopes
53dc0f1078
[NFC] Switch a few uses of undef to poison as placeholders for unreachable code
2022-07-03 14:34:03 +01:00
Alexey Zhikhartsev
02077da7e7
Add jump-threading optimization for deterministic finite automata
The current JumpThreading pass does not jump thread loops since it can
result in irreducible control flow that harms other optimizations. This
prevents switch statements inside a loop from being optimized to use
unconditional branches.

This code pattern occurs in the core_state_transition function of
Coremark. The state machine can be implemented manually with goto
statements resulting in a large runtime improvement, and this transform
makes the switch implementation match the goto version in performance.
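
As a rough illustration of the two forms being compared (a made-up
two-state automaton, not the Coremark code; `countDigitRunsSwitch` and
`countDigitRunsGoto` are hypothetical names), the switch version
re-dispatches on the state every iteration, while the goto version jumps
straight to the next state's code:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class State { Skip, InNumber };

// Switch-in-a-loop form: each iteration goes back through the switch,
// even when the next state is already known on the path just taken.
int countDigitRunsSwitch(const std::string &text) {
  int runs = 0;
  State state = State::Skip;
  for (unsigned char c : text) {
    switch (state) {
    case State::Skip:
      if (std::isdigit(c)) { ++runs; state = State::InNumber; }
      break;
    case State::InNumber:
      if (!std::isdigit(c)) state = State::Skip;
      break;
    }
  }
  return runs;
}

// Hand-written goto form: every transition branches directly to the code
// for the next state, with no shared dispatch point.
int countDigitRunsGoto(const std::string &text) {
  int runs = 0;
  std::size_t i = 0;
skip:
  if (i == text.size()) return runs;
  if (std::isdigit(static_cast<unsigned char>(text[i++]))) {
    ++runs;
    goto in_number;
  }
  goto skip;
in_number:
  if (i == text.size()) return runs;
  if (!std::isdigit(static_cast<unsigned char>(text[i++]))) goto skip;
  goto in_number;
}
```

Both functions count maximal runs of digits and should agree on every
input; only the control flow differs.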

This patch specifically targets switch statements inside a loop that
have the opportunity to be threaded. Once it identifies an opportunity,
it creates new paths that branch directly to the correct code block.
For example, the left CFG could be transformed to the right CFG:

```
          sw.bb                        sw.bb
        /   |   \                    /   |   \
   case1  case2  case3          case1  case2  case3
        \   |   /                /       |       \
        latch.bb             latch.2  latch.3  latch.1
         br sw.bb              /         |         \
                           sw.bb.2     sw.bb.3     sw.bb.1
                            br case2    br case3    br case1
```

Co-author: Justin Kreiner @jkreiner
Co-author: Ehsan Amiri @amehsan

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D99205
2021-07-27 14:34:04 -04:00