llvm-project

Author	SHA1	Message	Date
Alexandros Lamprineas	e15d72adac	[FuncSpec] Adjust the names of specializations and promoted stack values Currently the naming scheme is a bit funky; the specializations are named after the original function followed by an arbitrary decimal number. This makes it hard to debug inlined specializations of recursive functions. With this patch I am adding ".specialized." in between of the original name and the suffix, which is now a single increment counter.	2023-09-19 14:40:31 +01:00
Alexandros Lamprineas	9627bcdeac	[FuncSpec][NFC] Command line option renaming. Standardize all options with 'funcspec' prefix and shorter abreviations. Differential Revision: https://reviews.llvm.org/D145378	2023-03-15 18:03:44 +00:00
Tiwari Abhinav Ashok Kumar	bfb1559fbe	[NFC] Fix missing colon in CHECK directives Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D144412	2023-02-21 00:13:04 +05:30
Alexandros Lamprineas	f952bc05fd	[IPSCCP] Create a Pass parameter to control specialization of functions. Required for D140210 in order to disable FuncSpec at {Os, Oz} optimization levels. Differential Revision: https://reviews.llvm.org/D140564	2022-12-23 16:54:45 +00:00
Momchil Velikov	e6b9fc4c8b	[FuncSpec] Global ranking of specialisations The `FunctionSpecialization` pass chooses specializations among the opportunities presented by a single function and its calls, progressively penalizing subsequent specialization attempts by artificially increasing the cost of a specialization, depending on how many specialization were applied before. Thus the chosen specializations are sensitive to the order the functions appear in the module and may be worse than others, had those others been considered earlier. This patch makes the `FunctionSpecialization` pass rank the specializations globally, i.e. choose the "best" specializations among the all possible specializations in the module, for all functions. Since this involved quite a bit of redesign of the pass data structures, this patch also carries: * removal of duplicate specializations * optimization of call sites update, by collecting per specialization the list of call sites that can be directly rewritten, without prior expensive check if the call constants and their positions match those of the specialized function. A bit of a write-up up about the FuncSpec data structures and operation: Each potential function specialisation is kept in a single vector (`AllSpecs` in `FunctionSpecializer::run`). This vector is populated by `FunctionSpecializer::findSpecializations`. The `findSpecializations` member function has a local `DenseMap` to eliminate duplicates - with each call to the current function, `findSpecializations` builds a specialisation signature (`SpecSig`) and looks it in the duplicates map. If the signature is present, the function records the call to rewrite into the existing specialisation instance. If the signature is absent, it means we have a new specialisation instance - the function calculates the gain and creates a new entry in `AllSpecs`. Negative gain specialisation are ignored at this point, unless forced. The potential specialisations for a function form a contiguous range in the `AllSpecs` [1]. This range is recorded in `SpecMap SM`, so we can quickly find all specialisations for a function. Once we have all the potential specialisations with their gains we need to choose the best ones, which fit in the module specialisation budget. This is done by using a max-heap (`std::make_heap`, `std::push_heap`, etc) to find the best `NSpec` specialisations with a single traversal of the `AllSpecs` vector. The heap itself is contained with a small vector (`BestSpecs`) of indices into `AllSpecs`, since elements of `AllSpecs` are a bit too heavy to shuffle around. Next the chosen specialisation are performed, that is, functions cloned, `SCCPSolver` primed, and known call sites updated. Then we run the `SCCPSolver` to propagate constants in the cloned functions, after which we walk the calls of the original functions to update them to call the specialised functions. --- [1] This range may contain specialisation that were discarded and is not ordered in any way. One alternative design is to keep a vector indices of all specialisations for this function (which would initially be, `i`, `i+1`, `i+2`, etc) and later sort them by gain, pushing non-applied ones to the back. This has the potential to speed `updateCallSites` up. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D139346 Change-Id: I708851eb38f07c42066637085b833ca91b195998	2022-12-14 15:30:32 +00:00
Alexandros Lamprineas	8136a0172b	[FuncSpec] Make the Function Specializer part of the IPSCCP pass. Reland 877a9f9abec61f06e39f1cd872e37b828139c2d1 since D138654 (parent) has been fixed with 9ebaf4fef4aac89d4eff08e48185d61bc893f14e and with 8f1e11c5a7d70f96943a72649daa69f152d73e90. Differential Revision: https://reviews.llvm.org/D126455	2022-12-10 14:39:49 +00:00
Alexandros Lamprineas	0f0cb92cb2	Revert "[FuncSpec] Make the Function Specializer part of the IPSCCP pass." This reverts commit 877a9f9abec61f06e39f1cd872e37b828139c2d1. It depends on the parent revision 42c2dc401742266da3e0251b6c1ca491f4779963 which needs to be reverted as it broke some buildbots, so reverting both.	2022-12-08 12:41:43 +00:00
Alexandros Lamprineas	877a9f9abe	[FuncSpec] Make the Function Specializer part of the IPSCCP pass. The aim of this patch is to minimize the compilation time overhead of running Function Specialization. It is about 40% slower to run as a standalone pass (IPSCCP + FuncSpec vs IPSCCP with FuncSpec) according to my measurements. I compiled the llvm testsuite with NewPM-O3 + LTO and measured single threaded [user + system] time of IPSCCP and FuncSpec by passing the '-time-passes' option to lld. Then I compared the two configurations in terms of Instruction Count of the total compilation (not of the individual passes) as in https://llvm-compile-time-tracker.com. Geomean for non-LTO builds is -0.25% and LTO is -0.5% approximately. You can find more info below: https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518 Differential Revision: https://reviews.llvm.org/D126455	2022-12-08 12:14:27 +00:00
Momchil Velikov	606d25e545	[FuncSpec] Compute specialisation gain even when forcing specialisation When rewriting the call sites to call the new specialised functions, a single call site can be matched by two different specialisations - a "less specialised" version of the function and a "more specialised" version of the function, e.g. for a function void f(int x, int y) the call like `f(1, 2)` could be matched by either void f.1(int x /* int y == 2 /); or void f.2(/ int x == 1, int y == 2 */); The `FunctionSpecialisation` pass tries to match specialisation in the order of decreasing gain, so "more specialised" functions are preferred to "less specialised" functions. This breaks, however, when using the flag `-force-function-specialization`, in which case the cost/benefit analysis is not performed and all the specialisations are equally preferable. This patch makes the pass calculate specialisation gain and order the specialisations accordingly even when `-force-function-specialization` is used, under the assumption that this flag has purely debugging purpose and it is reasonable to ignore the extra computing effort it incurs. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D136180	2022-10-26 10:08:03 +01:00

9 Commits