This is an attempt to reland D42600 and enabling this optimisation by default.
This also resolves the issue pointed out in the context of PGO build.
Differential Revision: https://reviews.llvm.org/D42600
This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4.
The original commit causes a Chrome build assertion failure with
ThinLTO: https://crbug.com/1443635
This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs(Callee Saved Registers)/FI(Frame Index).
Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change
for others.
Co-authored-by: junbuml
Differential Revision: https://reviews.llvm.org/D42600
IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.
The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.
Differential Revision: https://reviews.llvm.org/D141912
doesn't happened in peephole optimizer.
Summary: Converting a comparison against 1 or -1 into a comparison
against 0 can exploit record-form instructions for comparison optimization.
The conversion will happen only when a record-form instruction can be used
to replace the comparison during the peephole optimizer (see function optimizeCompareInstr).
In post-RA, we also want to optimize the comparison by using the record
form (see D131873) and it requires additional dataflow analysis to reliably
find uses of the CR register set.
It's reasonable to common the conversion for both peephole optimizer and
post-RA optimizer.
Converting to comparison against zero even when the optimization doesn't
happened in peephole optimizer may create additional opportunities for the
post-RA optimization.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D131374
The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct.
This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid.
In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled.
On the test changes, most are pretty straight forward. We loose some flags (in some cases, they'd have been dropped on the next CSE pass anyways). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul, previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE.
The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB.
Differential Revision: https://reviews.llvm.org/D112734
Add a new preparation pattern in PPCLoopInstFormPrep pass to reduce register
pressure.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D108750