21790 Commits

Author SHA1 Message Date
Fangrui Song
efd94c56ba Use llvm::stable_sort
While touching the code, simplify if feasible.

llvm-svn: 358996
2019-04-23 14:51:27 +00:00
Fedor Sergeev
652168a99b [CallSite removal] move InlineCost to CallBase usage
Converting InlineCost interface and its internals into CallBase usage.
Inliners themselves are still not converted.

Reviewed By: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60636

llvm-svn: 358982
2019-04-23 12:43:27 +00:00
David Green
63a2aa715a [LSR] Limit the recursion for setup cost
In some circumstances we can end up with setup costs that are very complex to
compute, even though the scevs are not very complex to create. This can also
lead to setupcosts that are calculated to be exactly -1, which LSR treats as an
invalid cost. This patch puts a limit on the recursion depth for setup cost to
prevent them taking too long.

Thanks to @reames for the report and test case.

Differential Revision: https://reviews.llvm.org/D60944

llvm-svn: 358958
2019-04-23 08:52:21 +00:00
Philip Reames
d748689c7f [InstCombine] Eliminate stores to constant memory
If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store.

The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead.

Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date.

Differential Revision: https://reviews.llvm.org/D60659

llvm-svn: 358919
2019-04-22 20:28:19 +00:00
Philip Reames
d8d9b7b20e [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine
In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask.  

llvm-svn: 358913
2019-04-22 19:30:01 +00:00
Justin Bogner
e90d5c8db0 [IPSCCP] Add missing AssumptionCacheTracker dependency
Back in August, r340525 introduced a dependency on the assumption
cache tracker in the ipsccp pass, but that commit missed a call to
INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache
improperly registered if SCCP is the only thing that pulls it in.

llvm-svn: 358903
2019-04-22 17:38:29 +00:00
Philip Reames
37104d7189 [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, LoopSimplify)
Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways.  In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available.  So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers.

This patch avoids an invalidation between the two requires in the following trivial pass pipeline:
opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>"
(when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops)

Differential Revision: https://reviews.llvm.org/D60790

llvm-svn: 358901
2019-04-22 17:13:43 +00:00
Nikita Popov
5aacc7a573 Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.

After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.

llvm-svn: 358876
2019-04-22 09:01:38 +00:00
Nikita Popov
5299e25f50 [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().

llvm-svn: 358875
2019-04-22 08:36:05 +00:00
Luqman Aden
2993661cc0 [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.
Summary:
Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.

Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)

Reviewers: sanjoy, apilipenko, nikic

Reviewed By: nikic

Subscribers: hiraditya, jfb, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60036

llvm-svn: 358816
2019-04-20 13:14:18 +00:00
Vedant Kumar
282b26ec4d [GVN+LICM] Use line 0 locations for better crash attribution
This is a follow-up to r291037+r291258, which used null debug locations
to prevent jumpy line tables.

Using line 0 locations achieves the same effect, but works better for
crash attribution because it preserves the right inline scope.

Differential Revision: https://reviews.llvm.org/D60913

llvm-svn: 358791
2019-04-19 22:36:40 +00:00
Eric Christopher
dfebd84eb3 Remove the EnableEarlyCSEMemSSA set of options from the legacy
and new pass managers. They were default to true and not being
used.

Differential Revision: https://reviews.llvm.org/D60747

llvm-svn: 358789
2019-04-19 22:18:53 +00:00
Alina Sbirlea
43709f7233 [LICM & MemorySSA] Make limit flags pass tuning options.
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60490

llvm-svn: 358772
2019-04-19 17:46:50 +00:00
Alina Sbirlea
0499a2f961 [NewPassManager] Adding pass tuning options: loop vectorize.
Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Reviewers: chandlerc

Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59723

llvm-svn: 358763
2019-04-19 16:11:59 +00:00
Fangrui Song
7137b54a03 [MergeFunc] Delete unused FunctionNode::release()
llvm-svn: 358742
2019-04-19 08:03:20 +00:00
Fangrui Song
884f557bb2 [MergeFunc] removeUsers: call remove() only on direct users
removeUsers uses a work list to collect indirect users and call remove()
on those functions. However it has a bug (`if (!Visited.insert(UU).second)`).

Actually, we don't have to collect indirect users.
After the merge of F and G, G's callers will be considered (added to
Deferred). If G's callers can be merged, G's callers' callers will be
considered.

Update the test unnamed-addr-reprocessing.ll to make it clear we can
still merge indirect callers.

llvm-svn: 358741
2019-04-19 07:57:51 +00:00
Chandler Carruth
ce3f75df1f [CallSite removal] Move the legacy PM, call graph, and some inliner
code to `CallBase`.

This patch focuses on the legacy PM, call graph, and some of inliner and legacy
passes interacting with those APIs from `CallSite` to the new `CallBase` class.
No interesting changes.

Differential Revision: https://reviews.llvm.org/D60412

llvm-svn: 358739
2019-04-19 05:59:42 +00:00
Fangrui Song
82216048e6 [MergeFunc] Use less_first() as the comparator of Schwartzian transform
llvm-svn: 358738
2019-04-19 05:49:29 +00:00
Saleem Abdulrasool
b96d9b3419 MergeFunc: preserve COMDAT information when creating a thunk
We would previously drop the COMDAT on the thunk we generated when replacing a
function body with the forwarding thunk. This would result in a function that
may have been multiply emitted and multiply merged to be emitted with the same
name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is
used for the deduplication of Value Witness functions for Swift.

llvm-svn: 358728
2019-04-19 01:48:36 +00:00
Alina Sbirlea
da0f71af7d [LoopUnroll] Move list of params into a struct [NFCI].
Summary: Cleanup suggested in review of r358304.

Reviewers: sanjoy, efriedma

Subscribers: jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60638

llvm-svn: 358723
2019-04-18 23:43:49 +00:00
Philip Reames
137995d8da [GuardWidening] Wire up a NPM version of the LoopGuardWidening pass
llvm-svn: 358704
2019-04-18 19:17:14 +00:00
Quentin Colombet
ea3364bf85 [BlockExtractor] Extend the file format to support the grouping of basic blocks
Prior to this patch, each basic block listed in the extrack-blocks-file
would be extracted to a different function.

This patch adds the support for comma separated list of basic blocks
to form group.

When the region formed by a group is not extractable, e.g., not single
entry, all the blocks of that group are left untouched.

Let us see this new format in action (comments are not part of the
file format):
;; funcName bbName[,bbName...]
   foo      bb1        ;; Extract bb1 in its own function
   foo      bb2,bb3    ;; Extract bb2,bb3 in their own function
   bar      bb1,bb4    ;; Extract bb1,bb4 in their own function
   bar      bb2        ;; Extract bb2 in its own function

Assuming all regions are extractable, this will create one function and
thus one call per region.

Differential Revision: https://reviews.llvm.org/D60746

llvm-svn: 358701
2019-04-18 18:28:30 +00:00
Philip Reames
adf288c5d9 [LoopPred] Fix a blatantly obvious bug in r358684
The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant.  I don't know how this got missed in the patch and review.  I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases.  Oops.

llvm-svn: 358688
2019-04-18 17:01:19 +00:00
Philip Reames
92a7177e6b [LoopPredication] Allow predication of loop invariant computations (within the loop)
The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i'
A = _a.length;
guard (i < A)
a = _a[i]
B = _b.length;
guard (i < B);
b = _b[i];
...
Z = _z.length;
guard (i < Z)
z = _z[i]
accum += a + b + ... + z;

Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form.

Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later.

As a secondary benefit, this allows LoopPred to expose peeling oppurtunities in a much more obvious manner.  See the udiv test changes as an example.  If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before.

A couple of subtleties in the implementation:
- SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point).  It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point.
- SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located.  (i.e. it can be in the loop)  This implies we have a speculation burden to prove before expanding them outside loops.
- invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meets the SCEV definition of invariance.  I plan to sink this part into SCEV once this has baked for a bit.

Differential Revision: https://reviews.llvm.org/D60093

llvm-svn: 358684
2019-04-18 16:33:17 +00:00
Eric Christopher
eff3b6fe7f Elaborate why we have an option on by default for enabling chr.
llvm-svn: 358641
2019-04-18 06:17:40 +00:00
Richard Trieu
7b6192025e Fix bad compare function over FusionCandidate.
Reverse the checking of the domiance order so that when a self compare happens,
it returns false.  This makes compare function have strict weak ordering.

llvm-svn: 358636
2019-04-18 01:39:45 +00:00
Akira Hatanaka
0b19f5aef9 Fix formatting. NFC
llvm-svn: 358623
2019-04-17 23:14:39 +00:00
Denis Bakhvalov
cfd25a4b0e Test commit by Denis Bakhvalov
Change-Id: I4d85123a157d957434902fb14ba50926b2d56212
llvm-svn: 358619
2019-04-17 22:27:30 +00:00
Kit Barton
3cdf87940f Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Differential Revision: https://reviews.llvm.org/D55851

llvm-svn: 358607
2019-04-17 18:53:27 +00:00
Philip Reames
88679717ce [InstCombine] Factor out unreachable inst idiom creation [NFC]
In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable.  We can't actually insert the unreachable instruction since that would require changing the CFG.  We leave that to simplifycfg later.

This just factors out that idiom creation so we don't duplicate the same mostly undocument idiom creation in multiple places.

llvm-svn: 358600
2019-04-17 17:37:58 +00:00
Florian Hahn
893aea58ea [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.
Summary:
In the following cases, unrolling can be beneficial, even when
optimizing for code size:
 1) very low trip counts
 2) potential to constant fold most instructions after fully unrolling.

We can unroll in those cases, by setting the unrolling threshold to the
loop size. This might highlight some cost modeling issues and fixing
them will have a positive impact in general.

Reviewers: vsk, efriedma, dmgreen, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D60265

llvm-svn: 358586
2019-04-17 15:57:43 +00:00
Roman Lebedev
0080645846 [CVP] processOverflowIntrinsic(): don't crash if constant-holding happened
As reported by Mikael Holmén in post-commit review in
https://reviews.llvm.org/D60791#1469765

llvm-svn: 358559
2019-04-17 06:35:07 +00:00
Eric Christopher
e29874eaa0 Revert "Add basic loop fusion pass." Per request.
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358553
2019-04-17 04:55:24 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
0ebbf72a63 Remove the run-slp-after-loop-vectorization option.
It's been on by default for 4 years and cleans up the pass
hierarchy.

llvm-svn: 358548
2019-04-17 02:26:27 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Kit Barton
ab70da0728 Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Phabricator: https://reviews.llvm.org/D55851
llvm-svn: 358543
2019-04-17 01:37:00 +00:00
Ali Tamur
e8de5cd602 Fix a typo in comments. [NFC]
llvm-svn: 358531
2019-04-16 21:37:43 +00:00
Sanjay Patel
e08783e2f5 [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101)
This is 1 of the problems discussed in the post-commit thread for:
rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html
and filed as:
https://bugs.llvm.org/show_bug.cgi?id=41101

Instcombine tries to canonicalize some of these cases (and there's room for improvement
there independently of this patch), but it can't always do that because of extra uses.
So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar
to how we detect commuted compares and commuted min/max/abs.

Differential Revision: https://reviews.llvm.org/D60723

llvm-svn: 358523
2019-04-16 20:41:20 +00:00
Nikita Popov
52b24ee932 [CVP] Simplify umulo and smulo that cannot overflow
If a umul.with.overflow or smul.with.overflow operation cannot
overflow, simplify it to a simple mul nuw / mul nsw. After the
refactoring in D60668 this is just a matter of removing an
explicit check against multiplications.

Differential Revision: https://reviews.llvm.org/D60791

llvm-svn: 358521
2019-04-16 20:31:41 +00:00
Simon Pilgrim
82ffa88a04 [SLP] Refactoring of the operand reordering code.
This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold:
i. Cleanup and simplify the reordering code, and
ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2.

This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo .

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D59973

llvm-svn: 358519
2019-04-16 19:27:00 +00:00
Nikita Popov
5ecd6a48b9 [InstCombine] Prune fshl/fshr with masked operands
If a constant shift amount is used, then only some of the LHS/RHS
operand bits are demanded and we may be able to simplify based on
that. InstCombineSimplifyDemanded already had the necessary support
for that, we just weren't calling it with fshl/fshr as root.

In particular, this allows us to relax some masked funnel shifts
into simple shifts, as shown in the tests.

Patch by Shawn Landden.

Differential Revision: https://reviews.llvm.org/D60660

llvm-svn: 358515
2019-04-16 19:05:49 +00:00
Nikita Popov
79dffc67b5 [IR] Add WithOverflowInst class
This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.

The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.

Differential Revision: https://reviews.llvm.org/D60668

llvm-svn: 358512
2019-04-16 18:55:16 +00:00
Hans Wennborg
21eb771dcb Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)
The original commit caused false positives from AddressSanitizer's
use-after-scope checks, which have now been fixed in r358478.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 358483
2019-04-16 12:13:25 +00:00
Hans Wennborg
6ae05777b8 Asan use-after-scope: don't poison allocas if there were untraced lifetime intrinsics in the function (PR41481)
If there are any intrinsics that cannot be traced back to an alloca, we
might have missed the start of a variable's scope, leading to false
error reports if the variable is poisoned at function entry. Instead, if
there are some intrinsics that can't be traced, fail safe and don't
poison the variables in that function.

Differential revision: https://reviews.llvm.org/D60686

llvm-svn: 358478
2019-04-16 07:54:20 +00:00
Quentin Colombet
474a9679bd [CodeExtractor] Add a few debug lines to understand why a region is not extracted
The CodeExtractor is not smart enough to compute which basic block is
the entry of a region. Instead it relies on the order of the list
of basic blocks that is handed to it and assumes that the entry
is the first block in the list.

Without the additional debug information, it is hard to understand
why a valid region does not get extracted, because we would miss
that the order of in the list just doesn't match what the CodeExtractor
wants.

NFC

llvm-svn: 358471
2019-04-16 02:12:05 +00:00
Quentin Colombet
fda0426888 [LSR] Rewrite misses some fixup locations if it splits critical edge
If LSR split critical edge during rewriting phi operands and
phi node has other pending fixup operands, we need to
update those pending fixups. Otherwise formulae will not be
implemented completely and some instructions will not be eliminated.

llvm.org/PR41445

Differential Revision: https://reviews.llvm.org/D60645

Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com>

llvm-svn: 358457
2019-04-15 22:23:46 +00:00
Philip Reames
e46d77d1d9 [LoopPred] Stop passing around builders [NFC]
This is a preparatory patch for D60093. This patch itself is NFC, but while preparing this I noticed and committed a small hoisting change in rL358419.

The basic structure of the new scheme is that we pass around the guard ("the using instruction"), and select an optimal insert point by examining operands at each construction point. This seems conceptually a bit cleaner to start with as it isolates the knowledge about insertion safety at the actual insertion point.

Note that the non-hoisting path is not actually used at the moment. That's not exercised until D60093 is rebased on this one.

Differential Revision: https://reviews.llvm.org/D60718

llvm-svn: 358434
2019-04-15 18:15:08 +00:00
Wolfgang Pieb
4fe42214e2 [DEBUGINFO] Prevent Instcombine from dropping debuginfo when removing zexts
Zexts can be treated like no-op casts when it comes to assessing whether their
removal affects debug info.

Reviewer: aprantl

Differential Revision: https://reviews.llvm.org/D60641

llvm-svn: 358431
2019-04-15 17:36:29 +00:00
Hiroshi Yamauchi
09e539fcae [PGO] Profile guided code size optimization.
Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html 

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422
2019-04-15 16:49:00 +00:00