Currently, SROA is CFG-preserving.
Not doing so does not affect any pipeline test. (???)
Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call.
By design, we can't really SROA alloca if their address escapes somehow,
but we have logic to deal with `load` of `select`/`PHI`,
where at least one of the possible addresses prevents promotion,
by speculating the `load`s and `select`ing between loaded values.
As one would expect, that requires ensuring that the speculation is actually legal.
Even ignoring complexity bailouts, that logic does not deal with everything,
e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`.
There can also be cases where the load is genuinely non-speculate.
So if we can't prove that the load can be speculated,
unfold the select, produce two-entry phi node, and perform predicated load.
Now, that transformation must obviously update Dominator Tree,
since we require it later on. Doing so is trivial.
Additionally, we don't want to do this for the final SROA invocation (D136806).
In the end, this ends up having negative (!) compile-time cost:
https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u
Though indeed, this only deals with `select`s, `PHI`s are still using speculation.
Should we update some more analysis?
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138238
This reverts commit 739611870d3b06605afe25cc07833f6a62de9545,
and recommits 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5
with a fixed assertion - we should check that DTU is there,
not just assert false...
The assertion about not modifying the CFG seems to not hold,
will recommit in a bit.
https://lab.llvm.org/buildbot#builders/139/builds/32412
This reverts commit 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5.
This reverts commit 4f90f4ada33718f9025d0870a4fe3fe88276b3da.
Currently, SROA is CFG-preserving.
Not doing so does not affect any pipeline test. (???)
Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call.
By design, we can't really SROA alloca if their address escapes somehow,
but we have logic to deal with `load` of `select`/`PHI`,
where at least one of the possible addresses prevents promotion,
by speculating the `load`s and `select`ing between loaded values.
As one would expect, that requires ensuring that the speculation is actually legal.
Even ignoring complexity bailouts, that logic does not deal with everything,
e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`.
There can also be cases where the load is genuinely non-speculate.
So if we can't prove that the load can be speculated,
unfold the select, produce two-entry phi node, and perform predicated load.
Now, that transformation must obviously update Dominator Tree,
since we require it later on. Doing so is trivial.
Additionally, we don't want to do this for the final SROA invocation (D136806).
In the end, this ends up having negative (!) compile-time cost:
https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u
Though indeed, this only deals with `select`s, `PHI`s are still using speculation.
Should we update some more analysis?
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138238
Tests were updated with this script:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
However, in this case a lot of fixup was required, due to many
minor, but ultimately immaterial differences in results. In
particular, the GEP representation changes slightly in many cases,
either because we now use an i8 GEP, or because we now leave a
GEP alone, using it's original index types and (lack of) inbounds.
basictest-opaque-ptrs.ll has been dropped, because it was an
opaque pointers duplicate of basictest.ll.
In the process of rewriting `alloca`s and `phi`s that use them, the SROA
pass can try to insert a non-PHI instruction by calling
`getFirstInsertionPt()`, which is not possible in a catchswitch BB. This
CL makes we bail out on these cases.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D117168