llvm-project

Author	SHA1	Message	Date
Sanjay Patel	bfb9b8e075	[Passes] add a tail-call-elim pass near the end of the opt pipeline We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later. In the motivating case from issue #47852, the missing 'tail' on memset leads to sub-optimal codegen. I experimented with removing the early instance of tail-call-elim instead of just adding another pass, but that appears to be slightly worse for compile-time: +0.15% vs. +0.08% time. "tailcall" shows adding the pass; "tailcall2" shows moving the pass to later, then adding the original early pass back (so 1596886802 is functionally equivalent to 180b0439dc ): https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright Note that there was an effort to split the tail call functionality into 2 passes - that could help reduce compile-time if we find that this change costs more in compile-time than expected based on the preliminary testing: D60031 Differential Revision: https://reviews.llvm.org/D130374	2022-07-25 15:25:47 -04:00
Nikita Popov	9a9421a461	Reapply [InstCombine] Fold multiuse shr eq zero This was reverted due to performance regressions in ARM benchmarks, which have since been addressed by D101196 (SCEV analysis improvement) and D101778 (CGP reverse transform). ----- The single-use case is handled implicity by converting the icmp into a mask check first. When comparing with zero in particular, we don't need the one-use restriction, as we only produce a single icmp. https://alive2.llvm.org/ce/z/MSixcm https://alive2.llvm.org/ce/z/GwpG0M	2021-05-22 14:46:50 +02:00
Nikita Popov	24e9fbc1a3	Revert "[InstCombine] Fold multiuse shr eq zero" This reverts commit 9423f78240a216e3f38b394a41fe3427dee22c26. A performance regression with this patch has been reported at https://reviews.llvm.org/rG9423f78240a2#990953. Reverting for now.	2021-04-21 21:40:52 +02:00
Nikita Popov	9423f78240	[InstCombine] Fold multiuse shr eq zero The single-use case is handled implicity by converting the icmp into a mask check first. When comparing with zero in particular, we don't need the one-use restriction, as we only produce a single icmp. https://alive2.llvm.org/ce/z/MSixcm https://alive2.llvm.org/ce/z/GwpG0M	2021-04-19 22:13:11 +02:00
Craig Topper	36b5d09b07	[X86] Add phase ordering test for the problem D99427 is trying to solve. NFC	2021-03-28 12:14:30 -07:00

5 Commits