110 Commits

Author SHA1 Message Date
luxufan
25d9fde22e [SCCP] Skip computing intrinsics if one of its args is unknownOrUndef
For constant range supported intrinsics, we got consantrange from args
no matter if they are unknown or undef. And the constant range computed
from unknown or undef value state is full range.

I think compute with full constant range is harmful since although we
can do mergeIn after these args value state are changed, the merge
operation of two constant ranges is union.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152499
2023-06-10 15:48:46 +08:00
Alexandros Lamprineas
475ddca56e Reland "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo"
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.

Differential Revision: https://reviews.llvm.org/D150375
2023-06-08 17:44:47 +01:00
Mikhail Gudim
d37d4072f2 Reapply [SCCP] Constant propagation through freeze instruction
Reapply with extra check for struct types, which caused buildbot
failures last time.

-----

The freeze instruction has not been handled by SCCPInstVisitor.
This patch adds SCCPInstVisitor::visitFreezeInst(FreezeInst &I)
method to handle freeze instructions.

Differential Revision: https://reviews.llvm.org/D151659
2023-06-05 11:47:36 +02:00
Alexandros Lamprineas
b1f41685a6 [IPSCCP] Decouple queries for function analysis results.
The SCCPSolver is using a structure (AnalysisResultsForFn) where it keeps
pointers to various analyses needed by the IPSCCP pass. These analyses are
requested all at the same time, which can become problematic in some cases.
For example one could be retrieved via getCachedAnalysis() prior to the
actual execution of the analysis. In more detail:

The IPSCCP pass uses a DomTreeUpdater to preserve the PostDominatorTree
in case the PostDominatorTreeAnalysis had run before IPSCCP. Starting with
commit 1b1232047e83b the IPSCCP pass may use BlockFrequencyAnalysis for
some functions in the module. As a result, the PostDominatorTreeAnalysis
may not run until the BlockFrequencyAnalysis has run, since the latter
analysis depends on the former. Currently, we setup the DomTreeUpdater
using getCachedAnalysis to retrieve a PostDominatorTree. This happens
before BlockFrequencyAnalysis has run, therefore the cached analysis can
become invalid by the time we use it.

Differential Revision: https://reviews.llvm.org/D151666
2023-06-01 16:38:04 +01:00
Nikita Popov
223f9b096e Revert "[SCCP] Constant propagation through freeze instruction"
This reverts commit 559d47a1790e1a9f9b1f8838a443eb7624ef1ac7.

Caused failure on sanitizer-aarch64-linux-bootstrap-ubsan:

    clang++: /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Transforms/Utils/SCCPSolver.cpp:442: llvm::ValueLatticeElement &llvm::SCCPInstVisitor::getValueState(llvm::Value *): Assertion `!V->getType()->isStructTy() && "Should use getStructValueState"' failed.
2023-06-01 15:30:46 +02:00
Mikhail Gudim
559d47a179 [SCCP] Constant propagation through freeze instruction
The freeze instruction has not been handled by SCCPInstVisitor.
This patch adds SCCPInstVisitor::visitFreezeInst(FreezeInst &I)
method to handle freeze instructions.

Differential Revision: https://reviews.llvm.org/D151659
2023-06-01 15:06:59 +02:00
Nikita Popov
96a14f388b Revert "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo"
As reported on https://reviews.llvm.org/D150375#4367861 and
following, this change causes PDT invalidation issues. Revert
it and dependent commits.

This reverts commit 0524534d5220da5ecb2cd424a46520184d2be366.
This reverts commit ced90d1ff64a89a13479a37a3b17a411a3259f9f.
This reverts commit 9f992cc9350a7f7072a6dbf018ea07142ea7a7ed.
This reverts commit 1b1232047e83b69561fd64b9547cb0a0d374473a.
2023-05-30 14:49:03 +02:00
Alexandros Lamprineas
1b1232047e [FuncSpec] Replace LoopInfo with BlockFrequencyInfo.
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.

Differential Revision: https://reviews.llvm.org/D150375
2023-05-22 17:49:52 +01:00
Alexandros Lamprineas
54e5fb789c [FuncSpec] Track the return values of specializations.
To track the return values of specializations, we need to invalidate all
the lattice values across the use-def chain which originates from the
callsites, recompute and propagate.

Differential Revision: https://reviews.llvm.org/D146158
2023-04-24 13:18:49 +01:00
Alexandros Lamprineas
7ea597ead9 [FuncSpec] Consider constant struct arguments when specializing.
Optionally enabled just like integer and floating point arguments.

Differential Revision: https://reviews.llvm.org/D145374
2023-04-17 18:39:21 +01:00
Florian Hahn
7d10213317
Recommit "[SCCP] Support NUW/NSW inference for all overflowing binary operators."
This reverts commit 43acb61a08fffada31fb2e20e45fcc8492ef76b9.

Recommit the patch after fixing the issue causing the revert in 4e607ec4987.
Extra tests have been added in 5c6cb61ad416a544.

Original commit message:

   Extend the NUW/NSW inference logic add in 72121a20cd and cdeaf5f28c3dc
    to all overflowing binary operators.

    Reviewed By: nikic

    Differential Revision: https://reviews.llvm.org/D142721
2023-01-30 20:15:28 +00:00
Florian Hahn
4e607ec498
[SCCP] Flip range arguments for NSW region check.
This brings the operand order in line with the NUW handling, which was
missed out in 72121a20cda4dc91d0ef5548f930.

At the moment this is NFC as we only additions, but it
should fix miscompiles with 024115ab14822a recommitted.
2023-01-30 18:03:18 +00:00
Florian Hahn
43acb61a08
Revert "[SCCP] Support NUW/NSW inference for all overflowing binary operators."
This reverts commit 024115ab14822a97c09adcd2545c14e78b843b36.

I suspect that this may be causing some buildbot bootstrapping failures.
Revert while I investigate.
2023-01-28 21:33:28 +00:00
Florian Hahn
024115ab14
[SCCP] Support NUW/NSW inference for all overflowing binary operators.
Extend the NUW/NSW inference logic add in 72121a20cd and cdeaf5f28c3dc
to all overflowing binary operators.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142721
2023-01-28 17:40:41 +00:00
Florian Hahn
72121a20cd
[SCCP] Use range info to prove AddInst has NSW flag.
This patch updates SCCP to use the value ranges of AddInst operands to
try to prove the AddInst does not overflow in the signed sense and
adds the NSW flag. The reasoning is done with
makeGuaranteedNoWrapRegion (thanks @nikic for point it out!).

Follow-ups will include extending this to more
OverflowingBinaryOperators.

Depends on D142387.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142390
2023-01-27 14:09:26 +00:00
Florian Hahn
cdeaf5f28c
Recommit "[SCCP] Use range info to prove AddInst has NUW flag."
This reverts commit 531756b9544b3e164b9ab998292fce3ebbc7c746.

The recommitted version fixes a crash when one of the operands is a
constant other than a ConstantInt. Test for that case have been added
in 5b16cd97b8e1c273.

It splits off the new logic into a separate function because setting the
flags is quite different compared to the other cases handled in replaceSignedInst
which all require replacing an existing instruction.

It also guards makeGuaranteedNoWrapRegion by `if (!Inst.hasNoUnsignedWrap())`
as discussed in the review.

Fixes #60280.
Fixes #60278.

Original message:
    This patch updates SCCP to use the value ranges of AddInst operands to
    try to prove the AddInst does not overflow in the unsigned sense and
    adds the NUW flag. The reasoning is done with
    makeGuaranteedNoWrapRegion (thanks @nikic for point it out!).

    Follow-ups will include adding NSW and extension to more
    OverflowingBinaryOperators.

    Reviewed By: nikic

    Differential Revision: https://reviews.llvm.org/D142387
2023-01-26 11:45:10 +00:00
Florian Hahn
531756b954
Revert "Recommit "[SCCP] Use range info to prove AddInst has NUW flag.""
This reverts commit 366e1faa2fffd5b2284e25b09b6a26bcd2aca2b7.

It looks like this exposes another set of crashes on the buildbots.
Revert while I investigate.
2023-01-25 18:22:24 +00:00
Florian Hahn
366e1faa2f
Recommit "[SCCP] Use range info to prove AddInst has NUW flag."
The recommitted version fixes a crash when one of the operands is a
constant other than a ConstantInt. Test for that case have been added
in 5b16cd97b8e1c273.

It splits off the new logic into a separate function because setting the
flags is quite different compared to the other cases handled in replaceSignedInst
which all require replacing an existing instruction. Instructions are
now refined before any replacements are done, which has the advantage
that we should have lattice values for all operands (fixing the crashes
and simplifies the logic) and also allows optimizing more cases where one
of the operands also gets replaced (see improvements in
@sge_with_sext_to_zext_conversion).

It also guards makeGuaranteedNoWrapRegion by `if (!Inst.hasNoUnsignedWrap())`
as discussed in the review.

Fixes #60280.
Fixes #60278.

Original message:
    This patch updates SCCP to use the value ranges of AddInst operands to
    try to prove the AddInst does not overflow in the unsigned sense and
    adds the NUW flag. The reasoning is done with
    makeGuaranteedNoWrapRegion (thanks @nikic for point it out!).

    Follow-ups will include adding NSW and extension to more
    OverflowingBinaryOperators.

    Reviewed By: nikic

    Differential Revision: https://reviews.llvm.org/D142387
2023-01-25 18:07:25 +00:00
Douglas Yung
c9401f2ebe Revert "[SCCP] Use range info to prove AddInst has NUW flag."
This reverts commit de122cb920080fd9e24b2777114271fbef932d5e.

This change causes assertion failures in many of our internal tests.
I have filed #60280 for this issue.
2023-01-24 21:23:03 -08:00
Florian Hahn
de122cb920
[SCCP] Use range info to prove AddInst has NUW flag.
This patch updates SCCP to use the value ranges of AddInst operands to
try to prove the AddInst does not overflow in the unsigned sense and
adds the NUW flag. The reasoning is done with
makeGuaranteedNoWrapRegion (thanks @nikic for point it out!).

Follow-ups will include adding NSW and extension to more
OverflowingBinaryOperators.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142387
2023-01-24 20:53:07 +00:00
Florian Hahn
a794b62c0c
[SCCPSolver] Move helper functions inside SCCPSolver (NFC).
This patch moves a couple of helper functions from the global llvm::
namespace into the SCCPSolver class. This reduces the need for separate
SCCPSolver arguments and also limits the scope of those functions that
have quite generic names.

(The remaining isConstant and isOverdefined should ideally be removed)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142370
2023-01-23 17:41:33 +00:00
serge-sans-paille
38818b60c5
Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955
2023-01-05 14:11:08 +01:00
Matt Arsenault
95570af6fa SCCPSolver: Remove unnecessary set empty check 2022-12-21 09:33:17 -05:00
Fangrui Song
c178ed33bd Transforms/Utils: llvm::Optional => std::optional 2022-12-12 08:29:05 +00:00
Alexandros Lamprineas
8136a0172b [FuncSpec] Make the Function Specializer part of the IPSCCP pass.
Reland 877a9f9abec61f06e39f1cd872e37b828139c2d1 since D138654 (parent)
has been fixed with 9ebaf4fef4aac89d4eff08e48185d61bc893f14e and with
8f1e11c5a7d70f96943a72649daa69f152d73e90.

Differential Revision: https://reviews.llvm.org/D126455
2022-12-10 14:39:49 +00:00
Alexandros Lamprineas
9ebaf4fef4 [IPSCCP] Move the IPSCCP run function under the IPO directory.
Reland 42c2dc401742266da3e0251b6c1ca491f4779963 which was reverted
in cb03b1bd99313a728d47060b909a73e7f5991231. The fix for the link
errors was to reintroduce one of the two occurences of 'Scalar'
under the LINK_COMPONENTS.

Differential Revision: https://reviews.llvm.org/D138654
2022-12-09 15:05:11 +00:00
Alexandros Lamprineas
cb03b1bd99 Revert "[IPSCCP] Move the IPSCCP run function under the IPO directory."
This reverts commit 42c2dc401742266da3e0251b6c1ca491f4779963.

This broke some buildbots:

 undefined reference to `llvm::createBitTrackingDCEPass()'
 undefined reference to `llvm::createAlignmentFromAssumptionsPass()'
 undefined reference to `llvm::createLoopUnrollPass(int, bool, bool, int, int, int, int, int, int)'
 undefined reference to `llvm::createLICMPass(unsigned int, unsigned int, bool)'
 undefined reference to `llvm::createWarnMissedTransformationsPass()'
 undefined reference to `llvm::createAlignmentFromAssumptionsPass()'
 undefined reference to `llvm::createCallSiteSplittingPass()'
 undefined reference to `llvm::createCFGSimplificationPass(llvm::SimplifyCFGOptions, std::function<bool (llvm::Function const&)>)'
 undefined reference to `llvm::createFloat2IntPass()'
 undefined reference to `llvm::createLowerConstantIntrinsicsPass()'
 undefined reference to `llvm::createLoopRotatePass(int, bool)'
 undefined reference to `llvm::createLoopDistributePass()'
 undefined reference to `llvm::createLoopSinkPass()'
 undefined reference to `llvm::createInstSimplifyLegacyPass()'
 undefined reference to `llvm::createDivRemPairsPass()'
 undefined reference to `llvm::createCFGSimplificationPass(llvm::SimplifyCFGOptions, std::function<bool (llvm::Function const&)>)'
 undefined reference to `llvm::SetLicmMssaOptCap'
 undefined reference to `llvm::SetLicmMssaNoAccForPromotionCap'
 undefined reference to `llvm::ForgetSCEVInLoopUnroll'
2022-12-08 12:42:34 +00:00
Alexandros Lamprineas
0f0cb92cb2 Revert "[FuncSpec] Make the Function Specializer part of the IPSCCP pass."
This reverts commit 877a9f9abec61f06e39f1cd872e37b828139c2d1.

It depends on the parent revision 42c2dc401742266da3e0251b6c1ca491f4779963
which needs to be reverted as it broke some buildbots, so reverting both.
2022-12-08 12:41:43 +00:00
Alexandros Lamprineas
877a9f9abe [FuncSpec] Make the Function Specializer part of the IPSCCP pass.
The aim of this patch is to minimize the compilation time overhead of
running Function Specialization. It is about 40% slower to run as a
standalone pass (IPSCCP + FuncSpec vs IPSCCP with FuncSpec) according
to my measurements. I compiled the llvm testsuite with NewPM-O3 + LTO
and measured single threaded [user + system] time of IPSCCP and FuncSpec
by passing the '-time-passes' option to lld. Then I compared the two
configurations in terms of Instruction Count of the total compilation
(not of the individual passes) as in https://llvm-compile-time-tracker.com.
Geomean for non-LTO builds is -0.25% and LTO is -0.5% approximately.

You can find more info below:

https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518

Differential Revision: https://reviews.llvm.org/D126455
2022-12-08 12:14:27 +00:00
Alexandros Lamprineas
42c2dc4017 [IPSCCP] Move the IPSCCP run function under the IPO directory.
The LLVMipo library no longer depends on the Scalar component.
The shared functions between IPSCCP and SCCP have been moved
under Utils, in the SCCPSolver.

This is preliminary work for D126455, in order to break a cyclic
dependency between LLVM libraries.

Differential Revision: https://reviews.llvm.org/D138654
2022-12-08 12:14:27 +00:00
luxufan
b0f7876917 [SCCP] Propagate equality of a not-constant
The equality state of a not-constant can be used to do constant
propagation. For example,
```
define i32 @equal_not_constant(ptr noundef %p, ptr noundef %q) {
entry:
  %cmp = icmp ne ptr %p, null
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %cmp.then = icmp eq ptr %p, %q
  br i1 %cmp.then, label %if.then1, label %if.end

if.then1:                                         ; preds = %if.then
  %cmp.then1 = icmp ne ptr %q, null
  call void @use(i1 %cmp.then1)
  br label %if.end

if.end:
  ret i32 0
}

```
In this case, we can fold `%cmp.then1` as `true`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D139289
2022-12-05 19:54:13 +08:00
Nikita Popov
f101196d69 [SCCP] Add support for with.overflow intrinsics
This adds SCCP support for extractvalues of with.overflow.
We compute both the range of the result value and determine when
the overflow value is always false.

Differential Revision: https://reviews.llvm.org/D137713
2022-12-05 12:00:22 +01:00
Nikita Popov
ce2f9ba2c9 [SCCP] Add helper for getting constant range (NFC)
Add a helper for the recurring pattern of getting a constant range
if the value lattice element is one, or a full range otherwise.
2022-11-09 12:42:36 +01:00
Nikita Popov
134bda4b61 [ValueLattice] Use DL-aware folding in getCompare()
Use DL-aware ConstantFoldCompareInstOperands() API instead of
ConstantExpr API. The practical effect of this is that SCCP can
now fold comparisons that require DL.
2022-11-02 10:41:11 +01:00
Nikita Popov
7ad3bb527e Reapply [ValueLattice] Fix getCompare() for undef values
Relative to the previous attempt, this also updates the
ValueLattice unit tests.

-----

Resolve the TODO about incorrect getCompare() behavior. This can
be made more precise (e.g. by materializing the undef value and
performing constant folding on it), but for now just return an
unknown result to fix the correctness issue.

This should be NFC in terms of user-visible behavior, because the
only user of this method (SCCP) was already guarding against
UndefValue results.
2022-11-02 10:23:06 +01:00
Nikita Popov
094d1298d1 Revert "[ValueLattice] Fix getCompare() for undef values"
This reverts commit 6de41e6d7652b74a5aa2bd78b0597cc53a8c3c2a.

Missed a unit test affected by this change, revert for now.
2022-11-01 17:03:40 +01:00
Nikita Popov
6de41e6d76 [ValueLattice] Fix getCompare() for undef values
Resolve the TODO about incorrect getCompare() behavior. This can
be made more precise (e.g. by materializing the undef value and
performing constant folding on it), but for now just return an
unknown result.
2022-11-01 16:41:24 +01:00
Momchil Velikov
c9b4dc3a81 [FuncSpec][NFC] Avoid redundant computations of DominatorTree/LoopInfo
The `FunctionSpecialization` pass needs loop analysis results for its
cost function. For this purpose, it computes the `DominatorTree` and
`LoopInfo` for a function in `getSpecializationBonus`.  This function,
however, is called O(number of call sites x number of arguments), but
the DominatorTree/LoopInfo can be computed just once.

This patch plugs into the PassManager infrastructure to obtain
LoopInfo for a function and removes ad-hoc computation from
`getSpecializatioBonus`.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D136332
2022-10-28 16:08:41 +01:00
Kevin P. Neal
cfb88ee3ba [StrictFP][IPSCCP] Constant fold intrinsics with metadata arguments
This teaches the SCCP Solver how to constant fold more intrinsics. Constant
folding appears to be just as good as D115737 but much, much lower in code
change impact as suggested by nikic.

The constrained floating-point intrinsics all take at least one metadata
argument and were the motivation for the change.

Differential Revision: https://reviews.llvm.org/D136466
2022-10-24 11:43:20 -04:00
Nikita Popov
ebc54e0cd4 [SCCP] Make check for unknown/undef in unary op handling more explicit (NFCI)
Make the implementation more similar to other functions, by
explicitly skipping an unknown/undef first, and always falling
back to overdefined at the end. I don't think it makes a difference
now, but could make one once the constant evaluation can fail. In
that case we would directly mark the result as overdefined now,
rather than keeping it unknown (and later making it overdefined
because we think it's undef-based).
2022-07-14 10:56:11 +02:00
Nikita Popov
6db3edc858 [SCCP] Don't check for UndefValue before calling markConstant()
The value lattice explicitly represents undef, and markConstant()
internally checks for UndefValue and will create an undef rather
than constant lattice element in that case.

This is mostly a code simplification, it has little practical impact
because we usually get undef results from undef operands, and those
don't get processed.

Only leave the check behind for the CmpInst case, because it
currently goes through this incorrect code in the getCompare()
implementation: f98697642c/llvm/include/llvm/Analysis/ValueLattice.h (L456-L457)

Differential Revision: https://reviews.llvm.org/D128330
2022-07-14 10:05:56 +02:00
Nikita Popov
07146a9e64 [SCCP] Fix typo in previous commit
Ooops, I tested a build from the wrong checkout.
2022-07-13 16:22:40 +02:00
Nikita Popov
e298dfbc1b [SCCP] Avoid ConstantExpr::get() call
Use ConstantFoldUnaryOpOperand() API instead. This is in
preparation for removing fneg constant expressions.
2022-07-13 16:20:34 +02:00
Nikita Popov
9b994593cc [SCCP] Only handle unknown lattice values in resolvedUndefsIn()
This is a minor refinement of resolvedUndefsIn(), mostly for clarity.
If the value of an instruction is undef, then that's already a legal
final result -- we can safely rauw such an instruction with undef.
We only need to mark unknown values as overdefined, as that's the
result we get for an instruction that has not been processed because
it has an undef operand.

Differential Revision: https://reviews.llvm.org/D128251
2022-07-01 09:14:37 +02:00
Nikita Popov
1f88d80408 [SCCP] Don't mark edges feasible when resolving undefs
As branch on undef is immediate undefined behavior, there is no need
to mark one of the edges as feasible. We can leave all the edges
non-feasible. In IPSCCP, we can replace the branch with an unreachable
terminator.

Differential Revision: https://reviews.llvm.org/D126962
2022-06-22 10:28:27 +02:00
Simon Moll
b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Nikita Popov
7fa97b473c [SCCP] Don't mark ranges from branch conditions as potentially undef
Now that transforms introducing branch on poison have been removed,
we can stop marking ranges that have been derived from branch
conditions as containing undef. The existing comment explains why
this is legal. I've checked that alive2 is happy with SCCP tests
after this change.

Differential Revision: https://reviews.llvm.org/D126647
2022-06-07 10:20:24 +02:00
Alexandros Lamprineas
a910337b5d [FuncSpec] Conditional jump or move depends on uninitialised value(s).
I found this bug when performing a two-stage build of clang with
Function Specialization enabled and tuned aggressively. The crash
appears only on release builds.

Fixes https://github.com/llvm/llvm-project/issues/55000.

Before accessing the contents of the ArgInfo iterator inside
SCCPInstVisitor::markArgInFuncSpecialization, we should be
checking that the iterator is valid.

Differential Revision: https://reviews.llvm.org/D124114
2022-04-27 07:28:25 +01:00
Alexandros Lamprineas
8045bf9d0d [FuncSpec] Support function specialization across multiple arguments.
The current implementation of Function Specialization does not allow
specializing more than one arguments per function call, which is a
limitation I am lifting with this patch.

My main challenge was to choose the most suitable ADT for storing the
specializations. We need an associative container for binding all the
actual arguments of a specialization to the function call. We also
need a consistent iteration order across executions. Lastly we want
to be able to sort the entries by Gain and reject the least profitable
ones.

MapVector fits the bill but not quite; erasing elements is expensive
and using stable_sort messes up the indices to the underlying vector.
I am therefore using the underlying vector directly after calculating
the Gain.

Differential Revision: https://reviews.llvm.org/D119880
2022-03-28 12:01:53 +01:00
Alexandros Lamprineas
910eb988eb [FuncSpec][NFC] Refactor internal structures.
`ArgInfo` is reduced to only contain a pair of {formal,actual} values.
The specialized function `Fn` and the `Partial` flag are redundant in
this structure. The `Gain` is moved to a new struct `SpecializationInfo`.

The value mappings created by cloneCandidateFunction() are being used
by rewriteCallSites() for matching the formal arguments of recursive
functions.

The list of specializations is passed by reference to calculateGains()
instead of being returned by value.

The `IsPartial` flag is removed from isArgumentInteresting() and
getPossibleConstants() as it's no longer used anywhere in the code.

Differential Revision: https://reviews.llvm.org/D120753
2022-03-03 13:08:13 +00:00