24929 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
75a4f35b26 Intrinsics: introduce llvm_any_ty aka ValueType Any
Specifically, gc.result benefits from this greatly. Instead of:

gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...

We now have a gc.result.* that can specialize to literally any type.

Differential Revision: http://reviews.llvm.org/D7020

llvm-svn: 226857
2015-01-22 20:14:38 +00:00
Sanjay Patel
37c41c1d2c merge consecutive stores of extracted vector elements (PR21711)
This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698. 
That patch was checked in at r224611, but reverted at r225031 because it
caused a failure outside of the regression tests.

The cause of the crash was not recognizing consecutive stores that have mixed
source values (loads and vector element extracts), so this patch adds a check
to bail out if any store value is not coming from a vector element extract.

This patch also refactors the shared logic of the constant source and vector
extracted elements source cases into a helper function.

Differential Revision: http://reviews.llvm.org/D6850
 

llvm-svn: 226845
2015-01-22 18:21:26 +00:00
David Blaikie
e7d473461e Revert "PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site."
The underlying bug has been fixed in r226736, so there's no need to
work around this anymore.

This reverts commit r220923.

llvm-svn: 226842
2015-01-22 17:49:59 +00:00
Adrian Prantl
2585a98d38 Rename DIExpressionIterator to DIExpression::iterator.
Addresses review feedback from Duncan.

llvm-svn: 226835
2015-01-22 16:55:20 +00:00
Michael Kuperstein
25e34d11f3 [DAGCombine] Produce better code for constant splats
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.

Differential Revision: http://reviews.llvm.org/D7093

Fixed recommit of r226811.

llvm-svn: 226816
2015-01-22 13:07:28 +00:00
Michael Kuperstein
ff74032018 Revert r226811, MSVC accepts code sane compilers don't.
llvm-svn: 226814
2015-01-22 12:48:07 +00:00
Michael Kuperstein
84fad3e5c9 [DAGCombine] Produce better code for constant splats
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.

Differential Revision: http://reviews.llvm.org/D7093

llvm-svn: 226811
2015-01-22 12:37:23 +00:00
Elena Demikhovsky
150d9f3187 Fixed a bug in type legalizer for masked load/store intrinsics.
The problem occurs when, after vectorization, we have type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional effort for expanding loads and truncating stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes in the cost estimation for masked memory
operations; they will be submitted separately.

llvm-svn: 226808
2015-01-22 12:07:59 +00:00
Elena Demikhovsky
94cfbbab33 Fixed a comment
llvm-svn: 226806
2015-01-22 10:01:36 +00:00
Elena Demikhovsky
9c26462a27 Fixed a bug in narrowing store operation.
Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type,
since the size of VT (1 bit) is not equal to its actual store size (8 bits).
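
The size mismatch described above can be sketched with a little arithmetic. This is only an illustration, not the LLVM code: the helper names `store_size_in_bits` and `can_narrow_store_to` are hypothetical, and the sketch assumes the store size is simply the value size rounded up to whole bytes.

```python
def store_size_in_bits(size_in_bits: int) -> int:
    """Round a value size up to whole bytes and return that size in bits."""
    return 8 * ((size_in_bits + 7) // 8)


def can_narrow_store_to(size_in_bits: int) -> bool:
    """Allow narrowing only when the type fills its store size exactly."""
    return size_in_bits == store_size_in_bits(size_in_bits)
```

Under this assumption a 1-bit type occupies 8 bits when stored, so it is rejected as a narrowing target, while 8-, 16- and 32-bit types pass.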

Added a test provided by David (dag@cray.com)

llvm-svn: 226805
2015-01-22 09:39:08 +00:00
Reid Kleckner
f690f50519 Win64 SEH: Emit the constant 1 for catch-all into xdata
llvm-svn: 226767
2015-01-22 02:27:44 +00:00
Adrian Prantl
531641a0c6 Make DwarfExpression use the new DIExpressionIterator. NFC.
llvm-svn: 226748
2015-01-22 00:00:59 +00:00
Tim Northover
3007ba0ab3 DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))
It can help with argument juggling on some targets, and is generally a good
idea.
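
The fold relies on AND distributing over OR. A minimal brute-force check of that identity, written here purely as an illustration (the function names are made up and this is not part of the patch), could look like this:

```python
from itertools import product


def fold_holds(x: int, m: int, n: int) -> bool:
    """Check (x & m) | (x & n) == x & (m | n) for one triple of values."""
    return ((x & m) | (x & n)) == (x & (m | n))


def fold_holds_for_width(bits: int) -> bool:
    """Exhaustively check the identity for all values of the given bit width."""
    values = range(1 << bits)
    return all(fold_holds(x, m, n) for x, m, n in product(values, repeat=3))
```

Because the identity is bitwise, checking a small width such as 4 bits already covers every per-bit combination of X, M and N.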

llvm-svn: 226740
2015-01-21 23:17:19 +00:00
Matthias Braun
c1988f384c LiveIntervalAnalysis: Mark subregister defs as undef when we determined they are only reading a dead superregister value
This was not necessary before as this case can only be detected when the
liveness analysis is at subregister level.

llvm-svn: 226733
2015-01-21 22:55:13 +00:00
Matthias Braun
311730ac78 LiveIntervalAnalysis: Factor out code to update liveness on vreg def removal
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.

This also fixes a case where SplitEditor::removeBackCopies() would miss
the subregister ranges.

llvm-svn: 226690
2015-01-21 19:02:30 +00:00
Matthias Braun
cfb8ad29b5 LiveIntervalAnalysis: Factor out code to update liveness on physreg def removal
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.

llvm-svn: 226687
2015-01-21 18:50:21 +00:00
Matthias Braun
1002baf7b9 LiveIntervalAnalysis: Remove unused pruneValue() variant.
llvm-svn: 226686
2015-01-21 18:45:57 +00:00
Tim Northover
cf3d80fedb Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))"
It hadn't gone through review yet, but was still on my local copy.

This reverts commit r226663

llvm-svn: 226665
2015-01-21 15:48:52 +00:00
Tim Northover
85cd2791c9 DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))
llvm-svn: 226663
2015-01-21 15:43:28 +00:00
Daniel Jasper
6b77455f81 Prevent binary-tree deterioration in sparse switch statements.
This addresses part of llvm.org/PR22262. Specifically, it prevents
considering the densities of sub-ranges that have fewer than
TLI.getMinimumJumpTableEntries() elements. Those densities won't help
jump tables.
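
As a rough sketch of the idea, assuming density means "number of cases divided by the size of the value range they span" (the actual heuristic in the switch lowering code is more involved), the filtering could be modelled as follows. The names `density` and `usable_density` are hypothetical, and `min_entries` stands in for the value returned by TLI.getMinimumJumpTableEntries().

```python
from typing import Optional, Sequence


def density(sorted_cases: Sequence[int]) -> float:
    """Share of the covered value range that holds an actual case value."""
    span = sorted_cases[-1] - sorted_cases[0] + 1
    return len(sorted_cases) / span


def usable_density(sorted_cases: Sequence[int], min_entries: int) -> Optional[float]:
    """Return the density, or None when the sub-range is too small to matter."""
    if len(sorted_cases) < min_entries:
        return None  # too few cases to form a jump table, so ignore this sub-range
    return density(sorted_cases)
```

With `min_entries` set to 4, a two-element sub-range such as `[10, 1000]` is skipped entirely, so its density can no longer influence where the switch is split.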

This is not a complete solution but works around the most pressing
issue.

Review: http://reviews.llvm.org/D7070
llvm-svn: 226600
2015-01-20 19:43:33 +00:00
Daniel Jasper
d106b734cf Factor out a splitSwitchCase() function so that it can be reused.
This is in preparation for a fix to llvm.org/PR22262. One of the ideas
here is to first find a good jump table range and then split
before and after it. Thereby, we don't need to use the
split-based-on-density heuristic at all, which can make the "binary
tree" deteriorate in various cases.

Also some minor cleanups.

No functional changes.

llvm-svn: 226551
2015-01-20 08:57:44 +00:00
Chandler Carruth
10f28f26fd [PM] Replace the Pass argument in MergeBasicBlockIntoOnlyPred with
a DominatorTree argument as that is the analysis that it wants to
update.

This removes the last non-loop utility function in Utils/ which accepts
a raw Pass argument.

llvm-svn: 226537
2015-01-20 01:37:09 +00:00
Adrian Prantl
5883af3faa Remove support for DIVariable's FlagIndirectVariable and expect
frontends to use a DIExpression with a DW_OP_deref instead.

This is not only a much more natural place for this information; there
is also a technical reason: The FlagIndirectVariable is used to mark a
variable that is turned into a reference by virtue of the calling
convention; this happens for example to aggregate return values.
The inliner, for example, may actually need to undo this indirection to
correctly represent the value in its new context. This is impossible to
implement because the DIVariable can't be safely modified. We can however
safely construct a new DIExpression on the fly.

llvm-svn: 226476
2015-01-19 17:57:29 +00:00
Rafael Espindola
12ca34f53f Bring r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats when
needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

llvm-svn: 226467
2015-01-19 15:16:06 +00:00
Chandler Carruth
37df2cfbf8 [PM] Remove the Pass argument from all of the critical edge splitting
APIs and replace it and numerous booleans with an option struct.

The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.
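
The shape of such an option struct can be sketched as below. This is a generic illustration of the builder style and not the LLVM API: the class name, field names and setter names are all hypothetical, and the preserved analyses are modelled as opaque objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EdgeSplitOptions:
    """Hypothetical option struct: preserved analyses plus boolean flags."""
    dominator_tree: Optional[object] = None   # analysis to keep up to date, if any
    loop_info: Optional[object] = None        # analysis to keep up to date, if any
    merge_identical_edges: bool = False
    preserve_lcssa: bool = False

    def set_merge_identical_edges(self) -> "EdgeSplitOptions":
        """Flip the flag and return self so calls can be chained."""
        self.merge_identical_edges = True
        return self

    def set_preserve_lcssa(self) -> "EdgeSplitOptions":
        """Flip the flag and return self so calls can be chained."""
        self.preserve_lcssa = True
        return self
```

A caller would construct the struct with whatever analyses it has and chain the setters, for example `EdgeSplitOptions(dominator_tree=dt).set_merge_identical_edges()`, instead of passing a Pass plus a row of positional booleans.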

The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.

This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.

llvm-svn: 226456
2015-01-19 12:09:11 +00:00
Michael Kuperstein
54c61edee7 [MIScheduler] Slightly better handling of constrainLocalCopy when both source and dest are local
This fixes PR21792.

Differential Revision: http://reviews.llvm.org/D6823

llvm-svn: 226433
2015-01-19 07:30:47 +00:00
David Blaikie
9459832ebd std::unique_ptrify the MCStreamer argument to createAsmPrinter
llvm-svn: 226414
2015-01-18 20:29:04 +00:00
Mehdi Amini
37f316afaf Improve DAG combine pass on certain IR vector patterns
Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector
produced suboptimal code in AVX2 mode with certain IR combinations.

In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32
(undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical,
but then mysteriously generated rather bad code; the movq/movhpd combination
didn't match.

The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs
would get promoted to 4f32 by the type legalizer, eventually resulting
in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing
these were both half the output size, concatted them and then produced
a shuffle. However, the resulting concat + shuffle was more complex than
it should be; in the case where the upper half of the output is undef, we
probably want to generate shuffle + concat instead.

This enhancement causes the vector_shuffle combine step to recognize this
suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR
in case the same suboptimal pattern occurs for other reasons.

This results in the optimizer correctly producing the optimal movq + movhpd
sequence for all three variations on this IR, even with AVX2.

I've included a test case.

Radar link: rdar://problem/19287012
Fix for PR 21943.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 226360
2015-01-17 01:35:56 +00:00
Matthias Braun
7618b2b23d RegisterCoalescer: Cleanup and improved comment for a subtle detail.
llvm-svn: 226353
2015-01-17 00:33:13 +00:00
Matthias Braun
0eb940aed0 RegisterCoalescer: Cleanup by factoring out a common expression
llvm-svn: 226352
2015-01-17 00:33:11 +00:00
Matthias Braun
e2fa081615 RegisterCoalescer: Cleanup comment style
- Consistently put comments above the function declaration, not the
  definition. To achieve this some duplicate comments got merged and
  some comment parts describing implementation details got moved into their
  functions.
- Consistently use doxygen comments above functions.
- Do not use doxygen comments inside functions.

llvm-svn: 226351
2015-01-17 00:33:09 +00:00
Matthias Braun
fc6ef3a270 RegisterCoalescer: Drive-by typo + whitespace fix
llvm-svn: 226350
2015-01-17 00:33:06 +00:00
Philip Reames
287987ca13 Update a comment
Be a bit more explicit about the fact that addrspace(1) is not reserved.

llvm-svn: 226344
2015-01-16 23:21:07 +00:00
Philip Reames
36319538d0 clang-format all the GC related files (NFC)
Nothing interesting here...

llvm-svn: 226342
2015-01-16 23:16:12 +00:00
Philip Reames
2b45395876 Move ownership of GCStrategy objects to LLVMContext
Note: This change ended up being slightly more controversial than expected.  Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions.

Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance.

In the near future, this will allow me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.)

Differential Revision: http://reviews.llvm.org/D6811

llvm-svn: 226311
2015-01-16 20:07:33 +00:00
Philip Reames
7de640a876 Remove gc.root's findCustomSafePoints mechanism
Searching all of the existing gc.root implementations I'm aware of (all three of them), there was exactly one use of this mechanism, and that was to implement a performance improvement that should have been applied to the default lowering.

Having this function requires a dependency on a CodeGen class (MachineFunction) in a class which is otherwise completely independent of CodeGen. I could solve this differently, but given that I see absolutely no value in preserving this mechanism, I'm going to just get rid of it.

Note: This is the first time I'm intentionally breaking previously supported gc.root functionality. Given 3.6 has branched, I believe this is a good time to do this.

Differential Revision: http://reviews.llvm.org/D7004

llvm-svn: 226305
2015-01-16 19:33:28 +00:00
Timur Iskhodzhanov
60b721363c Revert r226242 - Revert Revert Don't create new comdats in CodeGen
This breaks AddressSanitizer (ninja check-asan) on Windows

llvm-svn: 226251
2015-01-16 08:38:45 +00:00
Rafael Espindola
67a79e72f5 Revert "Revert Don't create new comdats in CodeGen"
This reverts commit r226173, adding r226038 back.

No change in this commit, but clang was changed to also produce trivial comdats for
constructors, destructors and vtables when needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

llvm-svn: 226242
2015-01-16 02:22:55 +00:00
Hal Finkel
5ef58eb86d Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers""
Reapply r226071 with fixes. Two fixes:

 1. We need to manually remove the old and create the new 'dead defs'
    associated with physical register definitions when we move the definition of
    the physical register from the copy point to the point of the original vreg def.

    This problem was picked up by the machine instruction verifier, and could trigger a
    verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've
    turned on the verifier in the tests.

 2. When moving the def point of the phys reg up, we need to make sure that it
    is neither defined nor read in between the two instructions. We don't, however,
    extend the live ranges of phys reg defs to cover uses, so just checking for
    live-range overlap between the pair interval and the phys reg aliases won't
    pick up reads. As a result, we manually iterate over the range and check for
    reads.

    A test soon to be committed to the PowerPC backend will test this change.
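
The manual scan described in the second fix can be pictured with a small model. This is a sketch under simplifying assumptions and not the RegisterCoalescer code: instructions are reduced to sets of defined and used register names, and `Instr` and `can_hoist_physreg_def` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Toy instruction: a name plus the registers it defines and reads."""
    name: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()


def can_hoist_physreg_def(block, def_index, copy_index, physreg_aliases):
    """True if nothing strictly between the vreg def and the copy defines or
    reads the physical register (or any of its aliases)."""
    for instr in block[def_index + 1:copy_index]:
        if instr.defs & physreg_aliases or instr.uses & physreg_aliases:
            return False
    return True
```

The explicit check on `uses` is the point of the sketch: a live-range overlap test alone would miss reads, because the live ranges of physical register defs are not extended to cover uses.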

Original commit message:

[RegisterCoalescer] Remove copies to reserved registers

This allows the RegisterCoalescer to join "non-flipped" range pairs with a
physical destination register -- which allows the RegisterCoalescer to remove
copies like this:

<vreg> = something (maybe a load, for example)
... (things that don't use PHYSREG)
PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having
compatible register classes, etc.)

Previously, the RegisterCoalescer handled only the opposite case (copying
*from* a physical register). I don't handle the problem fully here, but try to
get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more
common on PPC64/ELF systems.

llvm-svn: 226200
2015-01-15 20:32:09 +00:00
Philip Reames
66c9fb0d52 Style cleanup of old gc.root lowering code
Use static functions for helpers rather than static member functions.  a) this changes the linking (minor at best), and b) this makes it obvious no object state is involved.

llvm-svn: 226198
2015-01-15 19:49:25 +00:00
Philip Reames
b87144160e clang-format GCStrategy.cpp & GCRootLowering.cpp (NFC)
llvm-svn: 226196
2015-01-15 19:39:17 +00:00
Philip Reames
f27f373895 Split GCStrategy.cpp into two files (NFC)
This is preparation for an update to http://reviews.llvm.org/D6811.  GCStrategy.cpp will hopefully be moving into IR/, whereas the lowering logic needs to stay in CodeGen/

llvm-svn: 226195
2015-01-15 19:29:42 +00:00
Timur Iskhodzhanov
f5adf13fac Revert Don't create new comdats in CodeGen
It breaks AddressSanitizer on Windows.

llvm-svn: 226173
2015-01-15 16:14:34 +00:00
Mehdi Amini
fa546b29a0 Fix SelectionDAG -view-*-dags filtering
llvm-svn: 226163
2015-01-15 12:03:32 +00:00
Alexander Kornienko
8c0809c7f8 Replace size method call of containers to empty method where appropriate
This patch was generated by a clang tidy checker that is being open sourced.
The documentation of that checker is the following:

/// The emptiness of a container should be checked using the empty method
/// instead of the size method. It is not guaranteed that size is a
/// constant-time function, and it is generally more efficient and also shows
/// clearer intent to use empty. Furthermore some containers may implement the
/// empty method but not implement the size method. Using empty whenever
/// possible makes it easier to switch to another container in the future.
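
To illustrate why the two checks can differ in cost, here is a toy container, written in Python only for illustration (the patch itself concerns C++ containers, and the class below is hypothetical). Its `size` has to walk every node, while `empty` only inspects the head.

```python
class SinglyLinkedList:
    """Toy list that does not cache its length."""

    def __init__(self):
        self._head = None  # each node is a (value, next_node) tuple

    def push_front(self, value):
        self._head = (value, self._head)

    def size(self) -> int:
        """Linear time: counts nodes by walking the whole chain."""
        count, node = 0, self._head
        while node is not None:
            count += 1
            node = node[1]
        return count

    def empty(self) -> bool:
        """Constant time: only checks whether a first node exists."""
        return self._head is None
```

For a container like this, `lst.empty()` answers the emptiness question without the traversal that `lst.size() == 0` would perform, and it states the intent directly.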

Patch by Gábor Horváth!

llvm-svn: 226161
2015-01-15 11:41:30 +00:00
Chandler Carruth
b98f63dbdb [PM] Separate the TargetLibraryInfo object from the immutable pass.
The pass is really just a means of accessing a cached instance of the
TargetLibraryInfo object, and this way we can re-use that object for the
new pass manager as its result.

Lots of delta, but nothing interesting happening here. This is the
common pattern that is developing to allow analyses to live in both the
old and new pass manager -- a wrapper pass in the old pass manager
emulates the separation intrinsic to the new pass manager between the
result and pass for analyses.

llvm-svn: 226157
2015-01-15 10:41:28 +00:00
Hal Finkel
dd669615dd Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"
Reverting this while I investigate some bad behavior this is causing. As a
possibly-related issue, adding -verify-machineinstrs to one of the test cases
now fails because of this change:

  llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs

*** Bad machine code: No instruction at def index ***
- function:    foo
- basic block: BB#0 return (0x10007e21f10) [0B;736B)
- liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
- register:    %DS
Valno #3 is defined at 624r

*** Bad machine code: Live segment doesn't end at a valid instruction ***
- function:    foo
- basic block: BB#0 return (0x10007e21f10) [0B;736B)
- liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
- register:    %DS
[624r,624d:3)
LLVM ERROR: Found 2 machine code errors.

where 624r corresponds exactly to the interval combining change:

624B    %RSP<def> = COPY %vreg16; GR64:%vreg16
        Considering merging %vreg16 with %RSP
                RHS = %vreg16 [608r,624r:0)  0@608r
                updated: 608B   %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1]
        Success: %vreg16 -> %RSP
        Result = %RSP

llvm-svn: 226086
2015-01-15 03:08:59 +00:00
Chandler Carruth
62d4215baa [PM] Move TargetLibraryInfo into the Analysis library.
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.

This is in preparation for porting this analysis to the new pass
manager.

No functionality changed, and updates inbound for Clang and Polly.

llvm-svn: 226078
2015-01-15 02:16:27 +00:00
NAKAMURA Takumi
95b3880dd0 Win64Exception.cpp: Try to fix crash for x64 EH. "Per" might be null there.
llvm-svn: 226077
2015-01-15 02:15:21 +00:00
Hal Finkel
8299646236 [RegisterCoalescer] Remove copies to reserved registers
This allows the RegisterCoalescer to join "non-flipped" range pairs with a
physical destination register -- which allows the RegisterCoalescer to remove
copies like this:

<vreg> = something (maybe a load, for example)
... (things that don't use PHYSREG)
PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having
compatible register classes, etc.)

Previously, the RegisterCoalescer handled only the opposite case (copying
*from* a physical register). I don't handle the problem fully here, but try to
get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more
common on PPC64/ELF systems.

llvm-svn: 226071
2015-01-15 01:25:28 +00:00