14597 Commits

Author SHA1 Message Date
Benjamin Kramer
96f4b12880 [GlobalOpt] Don't look through aliases when sorting names of globals.
If both are different aliases to the same value the sorting becomes
non-deterministic as array_pod_sort is not stable.

llvm-svn: 263550
2016-03-15 14:18:26 +00:00
Chad Rosier
ebe559019b [SLP] Update comment to reflect reality. NFC.
llvm-svn: 263548
2016-03-15 13:27:58 +00:00
Eric Christopher
257338ff0f Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Eric Christopher
ee00abe5e6 Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Teresa Johnson
b43027d1e0 Move global ID computation from Function to GlobalValue (NFC)
Since the static getGlobalIdentifier and getGUID methods are now called
for global values other than functions, reflect that by moving these
methods to the GlobalValue class.

llvm-svn: 263524
2016-03-15 02:13:19 +00:00
Teresa Johnson
26ab5772b0 [ThinLTO] Renaming of function index to module summary index (NFC)
(Resubmitting after fixing missing file issue)

With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263513
2016-03-15 00:04:37 +00:00
Justin Lebar
6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Amaury Sechet
bdb261b4c0 Imporove load to store => memcpy
Summary: This now try to reorder instructions in order to help create the optimizable pattern.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Differential Revision: http://reviews.llvm.org/D16523

llvm-svn: 263503
2016-03-14 22:52:27 +00:00
Teresa Johnson
cec0cae313 Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"
This reverts commit r263490. Missed a file.

llvm-svn: 263493
2016-03-14 21:18:10 +00:00
Teresa Johnson
892920b358 [ThinLTO] Renaming of function index to module summary index (NFC)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263490
2016-03-14 21:05:56 +00:00
Adam Nemet
bb45810e4f Revert "Turn LoopLoadElimination on again"
This reverts commit r263472.

There is an LNT failure on clang-ppc64be-linux-lnt.  Turn this off,
while I am investigating.

llvm-svn: 263485
2016-03-14 20:38:55 +00:00
Sanjay Patel
ee52b6e77d allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Justin Lebar
9d94397859 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

llvm-svn: 263481
2016-03-14 20:18:54 +00:00
Keno Fischer
a91ae8336b [SLPVectorizer] Fix dependency list
Summary:
DemandedBits was added to the requirements of SLPVectorizer in rL261212
(and various earlier version of it), but the appropriate initialization
statement was accidentally forgotten.

Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]].

Patch by Yichao Yu.
Reviewers: mssimpso
Differential Revision: http://reviews.llvm.org/D18152

llvm-svn: 263476
2016-03-14 20:04:24 +00:00
Adam Nemet
5a19ae917b Turn LoopLoadElimination on again
The two issues that were discovered got fixed (r263058, r263173).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263472
2016-03-14 19:40:25 +00:00
Chad Rosier
00bd82cade [CVP] Replace nonnegative with positive, per Philip's request. NFC.
llvm-svn: 263430
2016-03-14 13:48:00 +00:00
Haicheng Wu
d60ae33d29 [CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative
The motivating example is this

for (j = n; j > 1; j = i) {
   i = j / 2;
}

The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.

llvm-svn: 263406
2016-03-14 03:24:28 +00:00
Mehdi Amini
ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Sanjay Patel
5658b58936 remove unnecessary cast; NFC
llvm-svn: 263343
2016-03-12 18:17:41 +00:00
Sanjay Patel
2e0027706a fix formatting; NFC
llvm-svn: 263342
2016-03-12 18:05:53 +00:00
Sanjay Patel
5781d840dd use range loops; NFCI
llvm-svn: 263341
2016-03-12 16:52:17 +00:00
Sanjay Patel
c4acbae63f [x86, InstCombine] delete x86 SSE2 masked store with zero mask
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.

llvm-svn: 263340
2016-03-12 15:16:59 +00:00
Eric Christopher
35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
George Burgess IV
b42b762bca [MemorySSA] Make a return type reflect reality. NFC.
llvm-svn: 263286
2016-03-11 19:34:03 +00:00
Sanjoy Das
b51325dbdb Introduce @llvm.experimental.deoptimize
Summary:
This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality.  In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Note: this change does not address how `@llvm.experimental_deoptimize`
is lowered.  That will be done in a later change.

Reviewers: chandlerc, rnk, atrick, reames

Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet

Differential Revision: http://reviews.llvm.org/D17732

llvm-svn: 263281
2016-03-11 19:08:34 +00:00
Vedant Kumar
e5a9a275d3 [PGO] Skip value profile instrumentation of inline asm
Value profile instrumentation treats inline asm calls like they are
indirect calls. This causes problems when the 'Callee' is passed to a
ptrtoint cast -- the verifier rightly claims that this is bogus and
crashes opt.

llvm-svn: 263278
2016-03-11 18:57:48 +00:00
Teresa Johnson
76a1c1d0ba [ThinLTO] Support for reference graph in per-module and combined summary.
Summary:
This patch adds support for including a full reference graph including
call graph edges and other GV references in the summary.

The reference graph edges can be used to make importing decisions
without materializing any source modules, can be used in the plugin
to make file staging decisions for distributed build systems, and is
expected to have other uses.

The call graph edges are recorded in each function summary in the
bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
there is PGO, where the ValueId can be mapped to the function GUID via
the ValueSymbolTable. In the function index in memory, the call graph
edges reference the target via the CalleeGUID instead of the
CalleeValueId.

The reference graph edges are recorded in each summary record with a
list of referenced value IDs, which can be mapped to value GUID via the
ValueSymbolTable.

Addtionally, a new summary record type is added to record references
from global variable initializers. A number of bitcode records and data
structures have been renamed to reflect the newly expanded scope of the
summary beyond functions. More cleanup will follow.

Reviewers: joker.eph, davidxl

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17212

llvm-svn: 263275
2016-03-11 18:52:24 +00:00
Mehdi Amini
99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Mehdi Amini
1e9c925182 Do not specialize IRBuilder to strip names in SROA
Summary:
Following r263086, we are replacing this by a runtime check.
More cleanup will follow on the IRBuilder itself, but I submitted
this patch separately as SROA has a fancy "prefixInserter" class
that needs extra-love.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18022

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263256
2016-03-11 17:15:34 +00:00
Chandler Carruth
ace8c8f765 [PM] Sink the "Expression" type for GVN into the class as a private
member type.

Because of how this type is used by the ValueTable, it cannot actually
have hidden visibility. GCC actually nicely warns about this but Clang
just silently ... I don't even know. =/ We should do a better job either
way though.

This should resolve a bunch of the GCC warnings about visibility that
the port of GVN triggered and make the visibility story a bit more
correct.

llvm-svn: 263250
2016-03-11 16:25:19 +00:00
Chandler Carruth
3bc9c7fb45 [PM] The order of evaluation of these analyses is actually significant,
much to my horror, so use variables to fix it in place.

This terrifies me. Both basic-aa and memdep will provide more precise
information when the domtree and/or the loop info is available. Because
of this, if your pass (like GVN) requires domtree, and then queries
memdep or basic-aa, it will get more precise results. If it does this in
the other order, it gets less precise results.

All of the ideas I have for fixing this are, essentially, terrible. Here
I've just caused us to stop having unspecified behavior as different
implementations evaluate the order of these arguments differently. I'm
actually rather glad that they do, or the fragility of memdep and
basic-aa would have gone on unnoticed. I've left comments so we don't
immediately break this again. This should fix bots whose host compilers
evaluate the order of arguments differently from Clang.

llvm-svn: 263231
2016-03-11 13:26:47 +00:00
Chandler Carruth
b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Benjamin Kramer
c126353473 [InstCombine] Use Twines to generate names.
Since the names are used in a loop this does more work in debug builds. In
release builds value names are generally discarded so we don't have to do
the concatenation at all. It's also simpler code, no functional change
intended.

llvm-svn: 263215
2016-03-11 10:20:56 +00:00
Chandler Carruth
89c45a162f [PM] Port GVN to the new pass manager, wire it up, and teach a couple of
tests to run GVN in both modes.

This is mostly the boring refactoring just like SROA and other complex
transformation passes. There is some trickiness in that GVN's
ValueNumber class requires hand holding to get to compile cleanly. I'm
open to suggestions about a better pattern there, but I tried several
before settling on this. I was trying to balance my desire to sink as
much implementation detail into the source file as possible without
introducing overly many layers of abstraction.

Much like with SROA, the design of this system is made somewhat more
cumbersome by the need to support both pass managers without duplicating
the significant state and logic of the pass. The same compromise is
struck here.

I've also left a FIXME in a doxygen comment as the GVN pass seems to
have pretty woeful documentation within it. I'd like to submit this with
the FIXME and let those more deeply familiar backfill the information
here now that we have a nice place in an interface to put that kind of
documentaiton.

Differential Revision: http://reviews.llvm.org/D18019

llvm-svn: 263208
2016-03-11 08:50:55 +00:00
Pete Cooper
adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Adam Nemet
efb234135c [LLE] Add missed LoopSimplify dependence
The code assumed that we always had a preheader without making the pass
dependent on LoopSimplify.

Thanks to Mattias Eriksson V for reporting this.

llvm-svn: 263173
2016-03-10 23:54:39 +00:00
Chandler Carruth
37f1f12226 [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hind sight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, its really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

llvm-svn: 263121
2016-03-10 15:31:17 +00:00
Chandler Carruth
d94a5962cc [SROA] Clean up some really weird code, no functionality changed.
We already have the instruction extracted into 'I', just cast that to
a store the way we do for loads. Also, we don't enter the if unless SI
is non-null, so don't test it again for null.

I'm pretty sure the entire test there can be nuked, but this is just the
trivial cleanup.

llvm-svn: 263112
2016-03-10 14:16:18 +00:00
Michael Zolotukhin
b88fbe08fc [SLP] Add -slp-min-reg-size command line option.
MinVecRegSize is currently hardcoded to 128; this patch adds a cl::opt
to allow changing it. I tried not to change any existing behavior for the default
case.

Differential revision: http://reviews.llvm.org/D13278

llvm-svn: 263089
2016-03-10 02:49:47 +00:00
Chandler Carruth
7776377e62 [gvn] Fix more indenting and formatting in regions of code that will
need to be changed for porting to the new pass manager.

Also sink the comment on the ValueTable class back to that class instead
of it dangling on an anonymous namespace.

No functionality changed.

llvm-svn: 263084
2016-03-10 00:58:20 +00:00
Chandler Carruth
169c84f1cc [gvn] Reformat a chunk of the GVN code that is strangely indented prior
to restructuring it for porting to the new pass manager.

No functionality changed.

llvm-svn: 263083
2016-03-10 00:58:18 +00:00
Chandler Carruth
61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames
e0a5454df4 Fix the build
I screwed up rebasing 263072.  This change fixes the build and passes all make check.

llvm-svn: 263073
2016-03-09 23:07:53 +00:00
Philip Reames
b54c8e6eea [LICM] Store promotion when memory is thread local
This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop.

Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.

In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.

Differential Revision: http://reviews.llvm.org/D16783

llvm-svn: 263072
2016-03-09 22:59:30 +00:00
Philip Reames
8f12eba78d [ValueTracking] Extract isKnownPositive [NFCI]
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.

llvm-svn: 263062
2016-03-09 21:31:47 +00:00
Philip Reames
ec8a8b5437 [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)
When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min.

Differential Revision: http://reviews.llvm.org/D17873

llvm-svn: 263059
2016-03-09 21:05:07 +00:00
Adam Nemet
660748ca8c [LLE] Add missing check for unit stride
I somehow missed this.  The case in GCC (global_alloc) was similar to
the new testcase except it had an array of structs rather than a two
dimensional array.

Fixes RP26885.

llvm-svn: 263058
2016-03-09 20:47:55 +00:00
Matthias Braun
c31032d607 InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2
As part of r251146 InstCombine was extended to call computeKnownBits on
every value in the function to determine whether it happens to be
constant. This increases typical compiletime by 1-3% (5% in irgen+opt
time) in my measurements. On the other hand this case did not trigger
once in the whole llvm-testsuite.

This patch introduces the notion of ExpensiveCombines which are only
enabled for OptLevel > 2. I removed the check in InstructionSimplify as
that is called from various places where the OptLevel is not known but
given the rarity of the situation I think a check in InstCombine is
enough.

Differential Revision: http://reviews.llvm.org/D16835

llvm-svn: 263047
2016-03-09 18:47:11 +00:00
Petar Jovanovic
921c2b4eb3 Reland r262337 "calculate builtin_object_size if arg is a removable pointer"
Original commit message:
 calculate builtin_object_size if argument is a removable pointer

 This patch fixes calculating correct value for builtin_object_size function
 when pointer is used only in builtin_object_size function call and never
 after that.

 Patch by Strahinja Petrovic.

 Differential Revision: http://reviews.llvm.org/D17337

Reland the original change with a small modification (first do a null check
and then do the cast) to satisfy ubsan.

llvm-svn: 263011
2016-03-09 14:12:47 +00:00
Adam Nemet
34785ecff1 [LoopDataPrefetch] Add stats and debug output
llvm-svn: 262998
2016-03-09 05:33:21 +00:00