Unused loop invariant loads were not sunk from the preheader to the exit
block, increasing live range.
This commit moves the sinkUnusedInvariant logic from IndVarSimplify to
LICM and adds functionality to sink unused loads that are not
clobbered by the loop body.
While sinking loop-invariant instructions from the preheader to the
exit block, we were skipping some instructions because the instruction
iterator was decremented twice.
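
A minimal sketch of this class of bug, not the actual LLVM code: the
block is modelled as a plain std::list of names, and Block, visitBuggy
and visitFixed are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a preheader: a list of instruction names.
using Block = std::list<std::string>;

// Buggy reverse walk: the iterator is stepped back at the top of the loop
// and again after the instruction is handled, so every other entry is
// never looked at.
std::vector<std::string> visitBuggy(const Block &B) {
  std::vector<std::string> Visited;
  auto I = B.end();
  while (I != B.begin()) {
    --I;
    Visited.push_back(*I);
    if (I == B.begin())
      break;
    --I; // second decrement: the element now under I is skipped
  }
  return Visited;
}

// Fixed reverse walk: exactly one decrement per iteration.
std::vector<std::string> visitFixed(const Block &B) {
  std::vector<std::string> Visited;
  auto I = B.end();
  while (I != B.begin()) {
    --I;
    Visited.push_back(*I);
  }
  return Visited;
}
```

For a block holding a, b, c, d, the buggy walk only reaches d and b,
while the fixed walk reaches d, c, b, a.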
When computing the BECount for multi-exit loops, we need to combine
individual exit counts using umin_seq rather than umin. This is
because an earlier exit may exit on the first iteration, in which
case later exit expressions will not be evaluated and could be
poisonous. We cannot propagate potential poison values from later
exits.
In particular, this avoids the introduction of "branch on poison"
UB when optimizing multi-exit loops.
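
The difference between the two combiners can be sketched with a toy
model, assuming poison is represented as an empty std::optional. Val,
umin and uminSeq are illustrative names, not SCEV APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Toy value type: std::nullopt models a poison value.
using Val = std::optional<std::uint64_t>;

// Plain umin: poison in either operand poisons the result.
Val umin(Val A, Val B) {
  if (!A || !B)
    return std::nullopt;
  return std::min(*A, *B);
}

// Sequential umin: if the first operand is zero, the second operand is
// never looked at, so its poison cannot leak into the result.
Val uminSeq(Val A, Val B) {
  if (!A)
    return std::nullopt;
  if (*A == 0)
    return std::uint64_t{0};
  return umin(A, B);
}
```

With a first exit count of 0 and a poison second exit count, umin
yields poison while uminSeq yields 0. When the first exit count is
nonzero, both forms agree.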
Differential Revision: https://reviews.llvm.org/D124910
These intrinsics, not the icmp+select pattern, are the canonical form
nowadays, so we might as well emit them directly.
This should not cause any regressions, but if it does,
they would need to be fixed regardless.
Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`,
but that is a pessimization, not a correctness issue.
Additionally, the non-intrinsic form has issues with undef,
see https://reviews.llvm.org/D88287#2587863
Same change as 0dda6333175c1749f12be660456ecedade3bcf21, but for
mul expressions. We want to first fold any constant operands and
then strengthen the nowrap flags, as we can compute more precise
flags at that point.