1248 Commits

Author SHA1 Message Date
Fangrui Song
d4b6fcb32e [Analysis] llvm::Optional => std::optional 2022-12-14 07:32:24 +00:00
Vasileios Porpodas
06911ba6ea [NFC] Cleanup: Replaces BB->getInstList().insert() with I->insertAt().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138877
2022-12-12 13:33:05 -08:00
Roman Lebedev
1bd0b82e50
[SimplifyCFG] FoldBranchToCommonDest(): deal with mismatched IV's in PHI's in common successor block
This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37

Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb

Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all (over all preds)
the selects needed to the NumBonusInsts

Differential Revision: https://reviews.llvm.org/D139275
2022-12-12 18:20:03 +03:00
Fangrui Song
c178ed33bd Transforms/Utils: llvm::Optional => std::optional 2022-12-12 08:29:05 +00:00
Dmitry Makogon
b134119137 [SimplifyCFG] Prohibit hoisting of llvm.deoptimize calls
This prohibits hoisiting identical llvm.deoptimize calls
from 'then' and 'else' blocks of a conditional branch.
This fixes a crash that happened because we didn't hoist
the return instructions together with the llvm.deoptimize calls,
so the verifier would crash.

Differential Revision: https://reviews.llvm.org/D139437
2022-12-09 17:44:32 +07:00
Roman Lebedev
92bf8f3f79
[NFC][SimplifyCFG] shouldFoldCondBranchesToCommonDestination(): report the common succ 2022-12-04 20:58:54 +03:00
Kazu Hirata
343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
Vasileios Porpodas
bebca2b6d5 [NFC] Cleanup: Replaces BB->getInstList().splice() with BB->splice().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138979
2022-12-01 15:37:51 -08:00
Kazu Hirata
2d1b093a37 [Utils] Use std::optional in SimplifyCFG.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 17:55:33 -08:00
OCHyams
86464ed3df [Assignment Tracking][15/*] Account for assignment tracking in simplifycfg
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Update simplifycfg:
sinkLastInstruction - preserve debug use-before-defs.

SpeculativelyExecuteBB - replace the value component of dbg.assign intrinsics
when stores are hoisted and merged using a select, and don't delete them.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133310
2022-11-18 10:15:55 +00:00
Nikita Popov
01ec0ff2dc [SimplifyCFG] Allow speculating block containing assume()
SpeculativelyExecuteBB(), which converts a branch + phi structure
into a select, currently bails out if the block contains an assume
(because it is not speculatable).

Adjust the fold to ignore ephemeral values (i.e. assumes and values
only used in assumes) for cost modelling purposes, and drop them
when performing the fold.

Theoretically, we could try to preserve the assume information by
generating a assume(br_cond || assume_cond) style assume, but this
is very unlikely to to be useful (because we don't do anything
useful with assumes of this form) and it would make things
substantially more complicated once we take operand bundle assumes
into account (which don't really support a || operation).
I'd prefer not to do that without good motivation.

Differential Revision: https://reviews.llvm.org/D137339
2022-11-04 09:26:35 +01:00
Nikita Popov
b03f7c3365 [SimplifyCFG] Use range based for loop (NFC) 2022-11-03 15:21:45 +01:00
Nikita Popov
592a96c03b [SimplifyCFG] Extract code for tracking ephemeral values (NFC)
To allow reusing this in more places in SimplifyCFG.
2022-11-03 14:28:12 +01:00
Yaxun (Sam) Liu
9d5adc7e49 Revert "reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost"
This reverts commit bd7949bcd86633bd4203b2ba6f891aea00fce4d1.

Revert this patch since reviwers have different opinions regarding
the approach in post-commit review.

Will open RFC for further discussion.

Differential Revision: https://reviews.llvm.org/D132408
2022-10-25 12:15:39 -04:00
Yaxun (Sam) Liu
bd7949bcd8 reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost
Fixed compile time increase due to always constructing LocalCostTracker.
Now only construct LocalCostTracker when needed.
2022-10-24 15:43:53 -04:00
Nikita Popov
dd61726d5b Revert "[SimplifyCFG] accumulate bonus insts cost"
This reverts commit e5581df60a35fffb0c69589777e4e126c849405f.

This causes major compile-time regressions, about 2-3% end-to-end
on CTMark.
2022-09-19 14:46:43 +02:00
Yaxun (Sam) Liu
e5581df60a [SimplifyCFG] accumulate bonus insts cost
SimplifyCFG folds

bool foo() {
  if (cond1) return false;
  if (cond2) return false;
  return true;
}

as

bool foo() {
  if (cond1 | cond2) return false
  return true;
}

'cond2' is called 'bonus insts' in branch folding since they introduce overhead
since the original CFG could do early exit but the folded CFG always executes
them. SimplifyCFG calculates the costs of 'bonus insts' of a folding a BB into
its predecessor BB which shares the destination. If it is below bonus-inst-threshold,
SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed.

When SimplifyCFG calculates the cost of 'bonus insts', it only consider 'bonus' insts
in the current BB to be considered for folding. This causes issue for unrolled loops
which share destinations, e.g.

bool foo(int *a) {
  for (int i = 0; i < 32; i++)
    if (a[i] > 0) return false;
  return true;
}

After unrolling, it becomes

bool foo(int *a) {
  if(a[0]>0) return false
  if(a[1]>0) return false;
  //...
  if(a[31]>0) return false;
  return true;
}

SimplifyCFG will merge each BB with its predecessor BB,
and ends up with 32 'bonus insts' which are always executed, which
is much slower than the original CFG.

The root cause is that SimplifyCFG does not consider the
accumulated cost of 'bonus insts' which are folded from
different BB's.

This patch fixes that by introducing a ValueMap to track
costs of 'bonus insts' coming from different BB's into
the same BB, and cuts off if the accumulated cost
exceeds a threshold.

Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D132408
2022-09-18 20:21:14 -04:00
Arthur Eubanks
5a33d1f0b9 [SimplifyCFG] Don't hoist allocas
D129370 started hoisting allocas across stacksave/stackrestore
boundaries which is wrong.

Reviewed By: chill, rnk

Differential Revision: https://reviews.llvm.org/D133730
2022-09-13 09:23:39 -07:00
Momchil Velikov
078899cd64 [SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions
SimplifyCFG does some common code hoisting, which is limited
to hoisting a sequence of identical instruction in identical
order and stops at the first non-identical instruction.

This patch allows hoisting instruction pairs over
same-length sequences of non-matching instructions. The
linear asymptotic complexity of the algorithm stays the
same, there's an extra parameter
`simplifycfg-hoist-common-skip-limit` serving to limit
compilation time and/or the size of the hoisted live ranges.

The patch improves SPECv6/525.x264_r by about 10%.

Reviewed By: nikic, dmgreen

Differential Revision: https://reviews.llvm.org/D129370
2022-09-05 15:13:46 +01:00
Kazu Hirata
7d8c2d17eb [llvm] Use range-based for loops (NFC)
Identified with modernize-loop-convert.
2022-09-03 23:27:25 -07:00
Kazu Hirata
56ea4f9bd3 [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-27 21:21:02 -07:00
Dmitry Makogon
9142f67ef2 [SimplifyCFG] Don't widen cond br if false branch has successors
Fixes https://github.com/llvm/llvm-project/issues/57221.

This limits the tryWidenCondBranchToCondBranch transform making it
work only if the false block of widenable condition branch
has no successors.

If that block has successors, then SimplifyCondBranchToCondBranch
may undo the transform done by tryWidenCondBranchToCondBranch, which
would lead to infinite cycle of transformation and eventually
an assert failing.

Differential Revision: https://reviews.llvm.org/D132356
2022-08-26 15:23:37 +07:00
Simon Pilgrim
fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Jameson Nash
3a8d7fe201 [SimplifyCFG] teach simplifycfg not to introduce ptrtoint for NI pointers
SimplifyCFG expects to be able to cast both sides to an int, if either side can be case to an int, but this is not desirable or legal, in general, per D104547.

Spotted in https://github.com/JuliaLang/julia/issues/45702

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128670
2022-08-15 15:11:48 -04:00
Kazu Hirata
50724716cd [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-14 12:51:58 -07:00
Kazu Hirata
a2d4501718 [llvm] Fix comment typos (NFC) 2022-08-07 00:16:14 -07:00
Paul Kirth
d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Nikita Popov
7314ad7a06 Revert "[SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions"
This reverts commit 7b0f6378e211e881c574748090a86beeab264ab3.

As commented on the review, this patch has a correctness issue
regarding the modelling of memory effects.
2022-08-01 09:20:56 +02:00
Momchil Velikov
7b0f6378e2 [SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions
SimplifyCFG does some common code hoisting, which is limited to hoisting a
sequence of identical instruction in identical order and stops at the first
non-identical instruction.

This patch allows hoisting instruction pairs over same-length sequences of
non-matching instructions. The linear asymptotic complexity of the algorithm
stays the same, there's an extra parameter `simplifycfg-hoist-common-skip-limit`
serving to limit compilation time and/or the size of the hoisted live ranges.

The patch improves SPECv6/525.x264_r by about 10%.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D129370
2022-08-01 07:55:14 +01:00
Paul Kirth
6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth
300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
Nuno Lopes
9df0b254d2 [NFC] Switch a few uses of undef to poison as placeholders for unreachable code 2022-07-23 21:50:11 +01:00
Alexander Shaposhnikov
c916840539 [SimplifyCFG] Improve SwitchToLookupTable optimization
Try to use the original value as an index (in the lookup table)
in more cases (to avoid one subtraction and shorten the dependency chain)
(https://github.com/llvm/llvm-project/issues/56189).

Test plan:
1/ ninja check-all
2/ bootstrapped LLVM + Clang pass tests

Differential revision: https://reviews.llvm.org/D128897
2022-07-13 23:21:45 +00:00
Yuanfang Chen
fcb7d76d65 [coroutine] add nomerge function attribute to llvm.coro.save
It is illegal to merge two `llvm.coro.save` calls unless their
`llvm.coro.suspend` users are also merged. Marks it "nomerge" for
the moment.

This reverts D129025.

Alternative to D129025, which affects other token type users like WinEH.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D129530
2022-07-12 10:39:38 -07:00
Nikita Popov
40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Nikita Popov
20962c1240 [SimplifyCFG] Don't split predecessors of callbr terminator
This addresses the assertion failure reported in
https://reviews.llvm.org/D124159#3631240.

I believe that this limitation in SplitBlockPredecessors is not
actually necessary (because unlike with indirectbr, callbr is
restricted in a way that does allow updating successors), but for
now fix the assertion failure the same way we do everywhere else,
by also skipping callbr.
2022-07-06 15:38:53 +02:00
Nikita Popov
f96cb66d19 [ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC)
As constant expressions can no longer trap, it only makes sense to
call isSafeToSpeculativelyExecute on Instructions, so limit the
API to accept only them, rather than general Operators or Values.
2022-07-06 11:12:49 +02:00
Nikita Popov
8ee913d83b [IR] Remove Constant::canTrap() (NFC)
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
2022-07-06 10:36:47 +02:00
Yuanfang Chen
b170d856a3 [SimplifyCFG] Skip hoisting common instructions that return token type
By LangRef, hoisting token-returning instructions obsures the origin
so it should be skipped. Found this issue while investigating a
CoroSplit pass crash.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129025
2022-07-05 11:21:57 -07:00
Nikita Popov
a4772cbaf0 Revert "[SimplifyCFG] Thread branches on same condition in more cases (PR54980)"
This reverts commit 4e545bdb355a470d601e9bb7f7b2693c99e61a3e.

The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.

The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.

So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.

Fixes https://github.com/llvm/llvm-project/issues/56203.
2022-07-05 16:57:46 +02:00
Nikita Popov
dc969061c6 [SimplifyCFG] Thread all predecessors with same value at once
If there are multiple predecessors that have the same condition
value (and thus same "real destination"), these were previously
handled by copying the threaded block for each predecessor.
Instead, we can reuse one block for all of them. This makes the
behavior of SimplifyCFG's jump threading match that of the
actual JumpThreading pass.

This also avoids the infinite combine loop reported in:
https://reviews.llvm.org/D124159#3624387
2022-07-05 14:33:53 +02:00
Nikita Popov
9604601c93 [SimplifyCFG] Remove redundant checks for hoisting (NFCI)
These conditions are later checked in the HoistTerminator code
path. Checking them here is somewhat confusing, because this code
only checks the first instruction in the block, which is not
necessarily the terminator.
2022-07-04 10:53:54 +02:00
Nikita Popov
a6d4b4138f [ConstantFold] Supports compares in ConstantFoldInstOperands()
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).

This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
2022-06-30 11:05:24 +02:00
Nikita Popov
2b089e9ae0 [SimplifyCFG] Try to merge edge block when threading (PR55765)
When threading, we always create a new block for the threaded edge
(even if the edge is not critical), which will later get folded back
into the predecessor if possible. Depending on precise processing
order, this separate block may break the detection of trivial
cycles in the threading code, which normally avoids infinite
threading of loops. Explicitly merge the created edge block into
the predecessor to avoid this.

Fixes https://github.com/llvm/llvm-project/issues/55765.

Differential Revision: https://reviews.llvm.org/D127216
2022-06-20 10:29:33 +02:00
Samuel Eubanks
bf02ed240d Prevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination
TurnSwitchRangeIntoICmp crashes when given a switch with a default
destination of unreachable
Addresses issue #53208
https://github.com/llvm/llvm-project/issues/53208

Differential revision: https://reviews.llvm.org/D127712
2022-06-16 16:11:24 +02:00
Nikita Popov
571c713144 [SimplifyCFG] Handle trapping aggregates (PR49839)
Handle the fact that not only constant expressions, but also
constant aggregates containing expressions can trap.

This still doesn't fix the original C reproducer, probably due to
more issues remaining in other passes.
2022-06-13 14:56:49 +02:00
Hans Wennborg
3800b157d7 [SimplifyCFG] Share code to compute switch density between ShouldBuildLookupTable() and ReduceSwitchRange()
They're computing the same thing. No functionality change.

Differential revision: https://reviews.llvm.org/D127482
2022-06-10 15:29:36 +02:00
Simon Moll
b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Fangrui Song
557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Florian Hahn
a80081763c
[SimplifyCFG] Avoid shifting by a too large exponent.
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.

To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124608
2022-04-29 15:19:06 +01:00