185899 Commits

Author SHA1 Message Date
David Zarzycki
7653ff398d [X86] Enable AVX512BW for memcmp()
llvm-svn: 373845
2019-10-06 10:25:52 +00:00
Matt Arsenault
e59296a051 AMDGPU/GlobalISel: Fall back on weird G_EXTRACT offsets
llvm-svn: 373842
2019-10-06 01:41:22 +00:00
Matt Arsenault
786a3953ba AMDGPU/GlobalISel: RegBankSelect mul24 intrinsics
llvm-svn: 373841
2019-10-06 01:37:39 +00:00
Matt Arsenault
c0ec72d4f8 AMDGPU/GlobalISel: RegBankSelect DS GWS intrinsics
llvm-svn: 373840
2019-10-06 01:37:38 +00:00
Matt Arsenault
bcd6b1d209 AMDGPU/GlobalISel: Lower G_ATOMIC_CMPXCHG_WITH_SUCCESS
llvm-svn: 373839
2019-10-06 01:37:37 +00:00
Matt Arsenault
a5b9c75674 GlobalISel: Partially implement lower for G_EXTRACT
Turn into shift and truncate. Doesn't yet handle pointers.

llvm-svn: 373838
2019-10-06 01:37:35 +00:00
Matt Arsenault
69c65a8609 AMDGPU/GlobalISel: Fix RegBankSelect for sendmsg intrinsics
This wasn't updated for the immarg handling change.

llvm-svn: 373837
2019-10-06 01:37:34 +00:00
Craig Topper
2decdf42b9 [FastISel] Copy the inline assembly dialect to the INLINEASM instruction.
Fixes PR43575.

llvm-svn: 373836
2019-10-05 23:21:17 +00:00
Simon Pilgrim
8815be04ec [X86][AVX] Push sign extensions of comparison bool results through bitops (PR42025)
As discussed on PR42025, with more complex boolean math we can end up with many truncations/extensions of the comparison results through each bitop.

This patch handles the cases introduced in combineBitcastvxi1 by pushing the sign extension through the AND/OR/XOR ops so its just the original SETCC ops that gets extended.

Differential Revision: https://reviews.llvm.org/D68226

llvm-svn: 373834
2019-10-05 20:49:34 +00:00
Sanjay Patel
e2321bb448 [SLP] avoid reduction transform on patterns that the backend can load-combine
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146

We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognizes patterns that could be combined later, but not do the optimization itself (
it's not a vector combine anyway, so it's probably out-of-scope for SLP).

Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.

In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:

  movbe   rax, qword ptr [rdi]

or:

  mov     rax, qword ptr [rdi]

Not some (half) vector monstrosity as we currently do using SLP:

  vpmovzxbq       ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
  vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
  movzx   eax, byte ptr [rdi]
  movzx   ecx, byte ptr [rdi + 5]
  shl     rcx, 40
  movzx   edx, byte ptr [rdi + 6]
  shl     rdx, 48
  or      rdx, rcx
  movzx   ecx, byte ptr [rdi + 7]
  shl     rcx, 56
  or      rcx, rdx
  or      rcx, rax
  vextracti128    xmm1, ymm0, 1
  vpor    xmm0, xmm0, xmm1
  vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
  vpor    xmm0, xmm0, xmm1
  vmovq   rax, xmm0
  or      rax, rcx
  vzeroupper
  ret

Differential Revision: https://reviews.llvm.org/D67841

llvm-svn: 373833
2019-10-05 18:03:58 +00:00
Simon Pilgrim
9ecacb0d54 [X86] lowerShuffleAsLanePermuteAndRepeatedMask - variable renames. NFCI.
Rename some variables to match lowerShuffleAsRepeatedMaskAndLanePermute - prep work toward adding some equivalent sublane functionality.

llvm-svn: 373832
2019-10-05 16:08:30 +00:00
David Bolvansky
41c934acaf [SelectionDAG] Add tests for LKK algorithm
Added some tests testing urem and srem operations with a constant divisor.

Patch by TG908 (Tim Gymnich)

Differential Revision: https://reviews.llvm.org/D68421

llvm-svn: 373830
2019-10-05 14:29:25 +00:00
Simon Pilgrim
f609c0a303 BranchFolding - IsBetterFallthrough - assert non-null pointers. NFCI.
Silences static analyzer null dereference warnings.

llvm-svn: 373823
2019-10-05 13:20:30 +00:00
James Molloy
b1f0183e57 [UnitTests] Try and pacify gcc-5
This looks like a defect in gcc-5 where it chooses a constexpr
constructor from the initializer-list that it considers to be explicit.

I've tried to reproduce but I can't install anything prior to gcc-6 easily
on my system, and that doesn't have the error. So this is speculative
pacification.

Reported by Steven Wan.

llvm-svn: 373820
2019-10-05 08:57:17 +00:00
Mehdi Amini
482f4d9aa9 Expose ProvidePositionalOption as a public API
The motivation is to reuse the key value parsing logic here to
parse instance specific pass options within the context of MLIR.
The primary functionality exposed is the "," splitting for
arrays and the logic for properly handling duplicate definitions
of a single flag.

Patch by: Parker Schuh <parkers@google.com>

Differential Revision: https://reviews.llvm.org/D68294

llvm-svn: 373815
2019-10-05 01:37:04 +00:00
Philip Reames
d5a4dad206 Fix a *nasty* miscompile in experimental unordered atomic lowering
This is an omission in rL371441.  Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't be ordered w/respect to side effects which followed before the end of the block.

Included test case is how I spotted this.  We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with.  I'm sure it showed up a bunch of other ways as well.

Spotted via manual inspecting of assembly differences in a corpus w/and w/o the new experimental mode.  Finding this with testing would have been "unpleasant".  

llvm-svn: 373814
2019-10-05 00:32:10 +00:00
Philip Reames
9fe5d730c7 [Test] Add a test case fo a missed oppurtunity in implicit null checking
llvm-svn: 373813
2019-10-04 23:46:26 +00:00
Ana Pazos
ea835f5ce8 [RISCV] Added missing ImmLeaf predicates
simm9_lsb0 and simm12_lsb0 operand types were missing predicates.

llvm-svn: 373812
2019-10-04 23:42:07 +00:00
Aditya Kumar
50afaa9d34 Add a unittest to verify for assumption cache
Reviewers: vsk, tejohnson

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D68095

llvm-svn: 373811
2019-10-04 23:36:59 +00:00
Aditya Kumar
6a2673605e Invalidate assumption cache before outlining.
Subscribers: llvm-commits

Tags: #llvm

Reviewers: compnerd, vsk, sebpop, fhahn, tejohnson

Reviewed by: vsk

Differential Revision: https://reviews.llvm.org/D68478

llvm-svn: 373807
2019-10-04 22:46:42 +00:00
Reid Kleckner
67cfa79c01 Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
This reverts r371177 (git commit f879c6875563c0a8cd838f1e13b14dd33558f1f8)

It caused PR43566 by removing empty, address-taken MachineBasicBlocks.
Such blocks may have references from blockaddress or other operands, and
need more consideration to be removed.

See the PR for a test case to use when relanding.

llvm-svn: 373805
2019-10-04 22:24:21 +00:00
Roman Lebedev
fb5af8b9b9 [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
  https://rise4fun.com/Alive/kPg

llvm-svn: 373802
2019-10-04 22:16:22 +00:00
Roman Lebedev
f304d4d185 [InstCombine] Right-shift shift amount reassociation with truncation (PR43564, PR42391)
Initially (D65380) i believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as it usually happens, i was wrong.

https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O

In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And "just" so that happens, we apparently can fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.

llvm-svn: 373801
2019-10-04 22:16:11 +00:00
Roman Lebedev
ae738641d5 [NFC][InstCombine] Autogenerate shift.ll test
llvm-svn: 373800
2019-10-04 22:15:57 +00:00
Roman Lebedev
007452532b [NFC][InstCombine] Autogenerate icmp-shr-lt-gt.ll test
llvm-svn: 373799
2019-10-04 22:15:49 +00:00
Roman Lebedev
3c56cc920f [NFC][InstCombine] Tests for bit test via highest sign-bit extract (w/ trunc) (PR43564)
https://rise4fun.com/Alive/x5IS

llvm-svn: 373798
2019-10-04 22:15:41 +00:00
Roman Lebedev
6a954748c8 [NFC][InstCombine] Tests for right-shift shift amount reassociation (w/ trunc) (PR43564, PR42391)
https://rise4fun.com/Alive/GEw

llvm-svn: 373797
2019-10-04 22:15:32 +00:00
Julian Lettner
68eefbb064 [lit] Use better name for "test in parallel" concept
In the past, lit used threads to run tests in parallel. Today we use
`multiprocessing.Pool`, which uses processes. Let's stay more abstract
and use "worker" everywhere.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D68475

llvm-svn: 373794
2019-10-04 21:40:20 +00:00
Jessica Paquette
784892c964 [MachineOutliner] Disable outlining from noreturn functions
Outlining from noreturn functions doesn't do the correct thing right now. The
outliner should respect that the caller is marked noreturn. In the event that
we have a noreturn function, and the outlined code is in tail position, the
outliner will not see that the outlined function should be tail called. As a
result, you end up with a regular call containing a return.

Fixing this requires that we check that all candidates live inside noreturn
functions. So, for the sake of correctness, don't outline from noreturn
functions right now.

Add machine-outliner-noreturn.mir to test this.

llvm-svn: 373791
2019-10-04 21:24:12 +00:00
Sanjay Patel
6e312388b6 [InstCombine] add tests for fneg disguised as fmul; NFC
llvm-svn: 373788
2019-10-04 20:54:14 +00:00
Huihui Zhang
da9e252491 [NFC] Add { } to silence compiler warning [-Wmissing-braces].
../llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:355:48: warning: suggest braces around initialization of subobject [-Wmissing-braces]
      return addMappingFromTable<1>(MI, MRI, { 0 }, Table);
                                               ^
                                               {}

llvm-svn: 373784
2019-10-04 20:04:34 +00:00
Eli Friedman
23ae13d51f [ScheduleDAG] When a node is cloned, add an edge between the nodes.
InstrEmitter's virtual register handling assumes that clones are emitted
after the cloned node.  Make sure this assumption actually holds.

Fixes a "Node emitted out of order - early" assertion on the testcase.

This is probably a very rare case to actually hit in practice; even
without the explicit edge, the scheduler will usually end up scheduling
the nodes in the expected order due to other constraints.

Differential Revision: https://reviews.llvm.org/D68068

llvm-svn: 373782
2019-10-04 19:51:40 +00:00
Martin Storsjo
174604c93c [test] Remove another two unnecessary uses of REQUIRES: target-windows. NFC.
Differential Revision: https://reviews.llvm.org/D68449

llvm-svn: 373780
2019-10-04 19:47:48 +00:00
Martin Storsjo
5b2e0ba28e [JITLink] Silence GCC warnings. NFC.
Use parentheses in an expression with mixed && and ||.

Differential Revision: https://reviews.llvm.org/D68447

llvm-svn: 373779
2019-10-04 19:47:42 +00:00
Craig Topper
87aa59a0c7 [X86] Remove isel patterns for mask vpcmpgt/vpcmpeq. Switch vpcmp to these based on the immediate in MCInstLower
The immediate form of VPCMP can represent these completely. The
vpcmpgt/eq are just shorter encodings.

This patch removes the isel patterns and just swaps the opcodes
and removes the immediate in MCInstLower. This matches where we do
some other encodings tricks.

Removes over 10K bytes from the isel table.

Differential Revision: https://reviews.llvm.org/D68446

llvm-svn: 373766
2019-10-04 18:02:46 +00:00
Craig Topper
074fa390d2 [X86] Add DAG combine to form saturating VTRUNCUS/VTRUNCS from VTRUNC
We already do this for ISD::TRUNCATE, but we can do the same for X86ISD::VTRUNC

Differential Revision: https://reviews.llvm.org/D68432

llvm-svn: 373765
2019-10-04 17:53:18 +00:00
Siva Chandra
4380647e79 Add few docs and implementation of strcpy and strcat.
Summary:
This patch illustrates some of the features like modularity we want
in the new libc. Few other ideas like different kinds of testing, redirectors
etc are not yet present.

Reviewers: dlj, hfinkel, theraven, jfb, alexshap, jdoerfert

Subscribers: mgorny, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67867

llvm-svn: 373764
2019-10-04 17:30:54 +00:00
James Molloy
717e540f7e [Automaton] Fix invalid iterator reference
Found by the expensive checks bot.

llvm-svn: 373763
2019-10-04 17:15:30 +00:00
James Molloy
9baac83a2e [ModuloSchedule] Do not remap terminators
This is a trivial point fix. Terminator instructions aren't scheduled, so
we shouldn't expect to be able to remap them.

This doesn't affect Hexagon and PPC because their terminators are always
hardware loop backbranches that have no register operands.

llvm-svn: 373762
2019-10-04 17:15:25 +00:00
Kevin P. Neal
68b8052121 [FPEnv] Strict FP tests should use the requisite function attributes.
A set of function attributes is required in any function that uses constrained
floating point intrinsics. None of our tests use these attributes.

This patch fixes this.

These tests have been tested against the IR verifier changes in D68233.

Reviewed by:	andrew.w.kaylor, cameron.mcinally, uweigand
Approved by:	andrew.w.kaylor
Differential Revision:	https://reviews.llvm.org/D67925

llvm-svn: 373761
2019-10-04 17:03:46 +00:00
Mikhail Maltsev
cfe8bedca0 [utils] Fix incompatibility of bisect[-skip-count] with Python 3
Summary:
This change replaces the print statements with print function calls
and also replaces the '/' operator (which is integer division in Py2,
but becomes floating point division in Py3) with the '//' operator
which has the same semantics in Py2 and Py3.

Reviewers: greened, michaelplatings, gottesmm

Reviewed By: greened

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68138

llvm-svn: 373759
2019-10-04 16:44:18 +00:00
Thomas Preud'homme
e37bc5e499 [NFC] [FileCheck] Reapply fix init of objects in unit tests
Summary:
Fix initialization style of objects allocated on the stack and member
objects in unit test to use the "Type Var(init list)" and
"Type Member{init list}" convention. The latter fixes the buildbot
breakage.

Reviewers: jhenderson, probinson, arichardson, grimar, jdenny

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68425

llvm-svn: 373755
2019-10-04 15:47:23 +00:00
Dmitry Preobrazhensky
434d59250e [AMDGPU][MC][GFX10][WS32] Corrected decoding of dst operand for v_cmp_*_sdwa opcodes
See bug 43484: https://bugs.llvm.org/show_bug.cgi?id=43484

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68349

llvm-svn: 373745
2019-10-04 13:04:17 +00:00
Simon Pilgrim
73be415dd6 Fix uninitialized variable warnings in directory_entry default constructor. NFCI
llvm-svn: 373742
2019-10-04 12:45:42 +00:00
Simon Pilgrim
84f5cd75b3 Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 373741
2019-10-04 12:45:27 +00:00
Dmitry Preobrazhensky
9bd763679f [AMDGPU][MC][GFX10] Enabled decoding of 'null' operand
See bug 43485: https://bugs.llvm.org/show_bug.cgi?id=43485

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68348

llvm-svn: 373740
2019-10-04 12:38:36 +00:00
Tim Northover
a7d90af1be ARM-Darwin: keep the frame register reserved even if not updated.
Darwin platforms need the frame register to always point at a valid record even
if it's not updated in a leaf function. Backtraces are more important than one
extra GPR.

llvm-svn: 373738
2019-10-04 12:29:32 +00:00
Owen Reynolds
e64369e76e [llvm-ar][test] Clarified comment
The test is dependant on the installation of the en_US.UTF-8
locale. The reasoning for this is clarified in the amended comment.

llvm-svn: 373737
2019-10-04 12:26:25 +00:00
Dmitry Preobrazhensky
94d040706d [AMDGPU][MC][GFX10] Corrected definition of FLAT GLOBAL/SCRATCH instructions
See bug 43483: https://bugs.llvm.org/show_bug.cgi?id=43483

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68347

llvm-svn: 373736
2019-10-04 12:10:22 +00:00
Simon Atanasyan
0d5250a858 [llvm-readobj] Remove redundant semicolon. NFC
llvm-svn: 373735
2019-10-04 12:08:10 +00:00