47516 Commits

Author SHA1 Message Date
Simon Pilgrim
5647e89f5a [X86] Split WriteCvtI2F/WriteCvtF2I into I<->F32 and I<->F64 scheduler classes
A lot of the models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first

llvm-svn: 332451
2018-05-16 10:53:45 +00:00
Amara Emerson
0d6a26dffc [GlobalISel][IRTranslator] Split aggregates during IR translation.
We currently handle all aggregates by creating one large LLT, and letting the
legalizer deal with splitting them up. However using this approach means that
we can't support big endian code correctly.

This patch changes the way that the IRTranslator deals with aggregate values,
by splitting them up into their constituent element values. To do this, parts
of the translator need to be modified to deal with multiple VRegs for a single
Value.

A new Value to VReg mapper is introduced to help keep compile time under
control, currently there is no measurable impact on CTMark despite the extra
code being generated in some cases.

Patch is based on the original work of Tim Northover.

Differential Revision: https://reviews.llvm.org/D46018

llvm-svn: 332449
2018-05-16 10:32:02 +00:00
Simon Dardis
5cf9de4b72 [mips] Add support for isBranchOffsetInRange and use it for MipsLongBranch
Add support for this target hook, covering MIPS, microMIPS and MIPSR6, along
with some tests. Also add missing getOppositeBranchOpc() cases exposed by the
tests.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46794

llvm-svn: 332446
2018-05-16 10:03:05 +00:00
Peter Smith
c811758da6 [AArch64] Support "S" inline assembler constraint
This patch re-introduces the "S" inline assembler constraint. This matches
an absolute symbolic address or a label reference. The primary use case is

asm("adrp %0, %1\n\t"
    "add %0, %0, :lo12:%1" : "=r"(addr) : "S"(&var));

I say re-introduces as it seems like "S" was implemented in the original
AArch64 backend, but it looks like it wasn't carried forward to the merged
backend. The original implementation had A and L modifiers that could be
used to print ":lo12:" to the string. It looks like gcc doesn't use these
and :lo12: is expected to be written in the inline assembly string so I've
not implemented A and L. Clang already supports the S modifier.

Fixes PR37180

Differential Revision: https://reviews.llvm.org/D46745

llvm-svn: 332444
2018-05-16 09:33:25 +00:00
Sander de Smalen
a680f558be [AArch64][SVE] Asm: Support for structured LD2, LD3 and LD4 (scalar+scalar) load instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46679

llvm-svn: 332442
2018-05-16 09:16:20 +00:00
Sander de Smalen
67f9154964 [AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46682

llvm-svn: 332433
2018-05-16 07:50:09 +00:00
Mikael Holmen
e01131decf Remove unused variable introduced in r332336
The unused variable caused a compilation warning:

../lib/Target/X86/X86ISelLowering.cpp:34614:17: error: unused variable 'SMax' [-Werror,-Wunused-variable]
    if (SDValue SMax = MatchMinMax(SMin, ISD::SMAX, C1))
                ^
1 error generated.

llvm-svn: 332431
2018-05-16 06:36:11 +00:00
Peter Collingbourne
ec8236ead1 ARM: Remove unnecessary argument. NFCI.
IsLittleEndian is already a field of ARMAsmBackend.

llvm-svn: 332420
2018-05-16 00:21:47 +00:00
Peter Collingbourne
76d463af0a ARM: Deduplicate code and remove unnecessary declaration. NFCI.
llvm-svn: 332419
2018-05-16 00:21:31 +00:00
Stanislav Mekhanoshin
57d341c27a [AMDGPU] Fix handling of void types in isLegalAddressingMode
It is legal for the type passed to isLegalAddressingMode to be
unsized or, more specifically, VoidTy. In this case, we must
check the legality of load / stores for all legal types. Directly
trying to call getTypeStoreSize is incorrect, and leads to breakage
in e.g. Loop Strength Reduction. This change guards against that
behaviour.

Differential Revision: https://reviews.llvm.org/D40405

llvm-svn: 332409
2018-05-15 22:07:51 +00:00
Evandro Menezes
8d522d811a [AArch64] Improve single vector lane unscaled stores
When storing the 0th lane of a vector, use a simpler and usually more
efficient scalar store instead.  In this case, also using the unscaled
offset.

Differential revision: https://reviews.llvm.org/D46762

llvm-svn: 332394
2018-05-15 20:41:12 +00:00
Peter Collingbourne
28d4ddab86 Nios2: Unbreak build.
llvm-svn: 332391
2018-05-15 20:21:58 +00:00
Chandler Carruth
5ecd81aab0 [x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to
specially handle SETB_C* pseudo instructions.

Summary:
While the logic here is somewhat similar to the arithmetic lowering, it
is different enough that it made sense to have its own function.
I actually tried a bunch of different optimizations here and none worked
well so I gave up and just always do the arithmetic based lowering.

Looking at code from the PR test case, we actually pessimize a bunch of
code when generating these. Because SETB_C* pseudo instructions clobber
EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple
SETB_C* pseudos from a single set of EFLAGS. This in turn causes the
lowering code to ruin all the clever code generation that SETB_C* was
hoping to achieve. None of this is needed. Whenever we're generating
multiple SETB_C* instructions from a single set of EFLAGS we should
instead generate a single maximally wide one and extract subregs for all
the different desired widths. That would result in substantially better
code generation. But this patch doesn't attempt to address that.

The test case from the PR is included as well as more directed testing
of the specific lowering pattern used for these pseudos.

Reviewers: craig.topper

Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D46799

llvm-svn: 332389
2018-05-15 20:16:57 +00:00
Konstantin Zhuravlyov
f13c9969fc AMDGPU: Fix v_dot{4, 8}* instruction encoding
Differential Revision: https://reviews.llvm.org/D46848

llvm-svn: 332387
2018-05-15 19:32:47 +00:00
Tom Stellard
e182b28ae4 AMDGPU/GlobalISel: Implement select() for G_FCONSTANT
Summary: Also clean up G_CONSTANT selection.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46170

llvm-svn: 332379
2018-05-15 17:57:09 +00:00
Konstantin Zhuravlyov
603a43fcd5 AMDGPU: Add disasm tests for deep learning instructions + fix v_fmac_f32 disasm
Differential Revision: https://reviews.llvm.org/D46853

llvm-svn: 332377
2018-05-15 17:39:13 +00:00
Simon Pilgrim
be9a206883 [X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes
BtVer2 - Fixes schedules for (V)CVTPS2PD instructions

A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first

llvm-svn: 332376
2018-05-15 17:36:49 +00:00
Krzysztof Parzyszek
db39bf4088 [Hexagon] Remove unused function from subtarget
llvm-svn: 332369
2018-05-15 16:32:24 +00:00
Krzysztof Parzyszek
8c389bd368 [Hexagon] Remove unused flag from subtarget and (non)corresponding test
llvm-svn: 332365
2018-05-15 16:13:52 +00:00
Simon Dardis
f40eb03ce9 [mips] Mark select instructions correctly
Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46702

llvm-svn: 332364
2018-05-15 16:05:04 +00:00
Simon Pilgrim
891ebcdbaa [X86] Split off F16C WriteCvtPH2PS/WriteCvtPS2PH scheduler classes
Btver2 - VCVTPH2PSYrm needs to double pump the AGU
Broadwell - missing VCVTPS2PH*mr stores extra latency

Allows us to remove the WriteCvtF2FSt conversion store class

llvm-svn: 332357
2018-05-15 14:12:32 +00:00
Simon Dardis
ce5d3d657a [mips] Fix formatting of floating point conversion patterns
llvm-svn: 332341
2018-05-15 11:21:07 +00:00
Simon Dardis
aa6bdba0ca [mips] Add disassembly support for comparison instructions
llvm-svn: 332340
2018-05-15 11:18:24 +00:00
Simon Dardis
b79ecec20d [mips] Fix predicates of mfc1, mtc1 instructions
Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46692

llvm-svn: 332339
2018-05-15 11:10:30 +00:00
Artur Gainullin
243a3d56d8 [X86] Improve unsigned saturation downconvert detection.
Summary:
New unsigned saturation downconvert patterns detection was implemented in
X86 Codegen:

(truncate (smin (smax (x, C1), C2)) to dest_type),
where C1 >= 0 and C2 is unsigned max of destination type.

(truncate (smax (smin (x, C2), C1)) to dest_type)
where C1 >= 0, C2 is unsigned max of destination type and C1 <= C2.
These two patterns are equivalent to:

(truncate (umin (smax(x, C1), unsigned_max_of_dest_type)) to dest_type)

Reviewers: RKSimon

Subscribers: llvm-commits, a.elovikov

Differential Revision: https://reviews.llvm.org/D45315

llvm-svn: 332336
2018-05-15 10:24:12 +00:00
Shiva Chen
3969425081 [RISCV] Define FeatureRelax and shouldForceRelocation for RISCV linker relaxation
1. Deine FeatureRelax to enable/disable linker relaxation.

2. Define shouldForceRelocation to preserve relocation types even if the fixup
   can be resolved when linker relaxation enabled. This is necessary for
   correctness as offsets may change during relaxation.

Differential Revision: https://reviews.llvm.org/D46674

llvm-svn: 332318
2018-05-15 01:28:50 +00:00
Martin Storsjo
ace7ae935f [ARM] Back up R4 and LR if calling the stack probe function
Differential Revision: https://reviews.llvm.org/D46777

llvm-svn: 332298
2018-05-14 21:32:52 +00:00
Krzysztof Parzyszek
44e180ba89 [Hexagon] Add a target feature to control using small data section
llvm-svn: 332292
2018-05-14 21:01:56 +00:00
Krzysztof Parzyszek
f66f7612bf [Hexagon] Add a target feature for generating new-value stores
llvm-svn: 332290
2018-05-14 20:41:04 +00:00
Krzysztof Parzyszek
771f2422d0 [Hexagon] Add a target feature for memop generation
llvm-svn: 332285
2018-05-14 20:09:07 +00:00
Simon Pilgrim
215ce4a1ca [X86] Add NT load/store scheduler classes
llvm-svn: 332274
2018-05-14 18:37:19 +00:00
Craig Topper
53ceb4805f [X86] Remove and autoupgrade avx512.vbroadcast.ss/avx512.vbroadcast.sd intrinsics.
llvm-svn: 332271
2018-05-14 18:21:22 +00:00
Simon Pilgrim
228d24a2d6 [X86][BtVer2] Fix MMX/YMM integer vector nt store schedules
MMX was missing and YMM was tagged as a fp nt store

llvm-svn: 332269
2018-05-14 18:07:28 +00:00
Krzysztof Parzyszek
329c3e9a5f [Hexagon] Avoid predicate copies to integer registers from store-locked
llvm-svn: 332260
2018-05-14 16:41:40 +00:00
Simon Dardis
bb818b4421 [mips] Fix the predicates of round, ceiling, floor and trunc.
Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46691

llvm-svn: 332258
2018-05-14 16:26:50 +00:00
Zaara Syeda
421a5960d2 [NFC] [Power] Fix instruction format for xsrqpi
xsrqpi is currently using Z23Form_1.
The instruction format is xsrqpi R,VRT,VRB,RMC.
Rathar than bits 11-15 being used for FRA, it should have
bits 11-14 reserved and bit 15 for R. This patch adds a new
class Z23Form_4 to fix the instruction format.

Differential Revision: https://reviews.llvm.org/D46761

llvm-svn: 332253
2018-05-14 15:45:15 +00:00
Evandro Menezes
14fa2e4fa5 [AArch64] Improve single vector lane stores
When storing the 0th lane of a vector, use a simpler and usually more efficient scalar store instead.

Differential revision: https://reviews.llvm.org/D46655

llvm-svn: 332251
2018-05-14 15:26:35 +00:00
Nicola Zaghen
d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Sander de Smalen
93380371bb [AArch64][SVE] Extend parsing of Prefetch operation for SVE.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46681

llvm-svn: 332234
2018-05-14 11:54:41 +00:00
Simon Dardis
fba0362096 [mips] Correct the predicates of indexed floating point stores and loads.
Also, fix the register class for microMIPS.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46689

llvm-svn: 332227
2018-05-14 10:53:15 +00:00
Craig Topper
266b7ae55d [X86] Cleanup a multiclass that doesn't need as many parameters after recent intrinsic removals.
llvm-svn: 332207
2018-05-14 00:17:52 +00:00
Craig Topper
0e71c6d5ca [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead.
llvm-svn: 332206
2018-05-14 00:06:49 +00:00
Craig Topper
97e74b05ef [X86] Add patterns for combining movss+uint_to_fp into the intrinsic instructions under AVX512.
This matches what we do for sint_to_fp.

llvm-svn: 332205
2018-05-13 23:24:21 +00:00
Craig Topper
85906cf041 [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.
llvm-svn: 332198
2018-05-13 18:03:59 +00:00
Matt Arsenault
432aaea63f AMDGPU: Rename OpenCL lowering pass to be R600 specific.
This pass is
  a) broken.
  b) r600 specific.

Fixing (a) is a bit more non-trivial, but fixing (b)
is easy. Move this pass to being R600 only for now.

This pass does pass all the unit tests, however clang
no longer generates code that looks like the unit test
input, so fixing the pass requires fixing the tests and
the pass as one, and checking it works with clang still.

Patch by Dave Airlie

llvm-svn: 332196
2018-05-13 10:04:48 +00:00
Matt Arsenault
dfb88dfe30 AMDGPU: Make undef legal for v2i16/v2f16
This is apparently necessary to stop undef from being
turned into a build_vector of 0s.

llvm-svn: 332195
2018-05-13 10:04:38 +00:00
Craig Topper
38b713d4a7 [X86] Add some load folding patterns for cvtsi2ss/sd into intrinsic instructions.
llvm-svn: 332189
2018-05-13 01:54:33 +00:00
Craig Topper
df3a9cedff [X86] Remove an autoupgrade legacy cvtss2sd intrinsics.
llvm-svn: 332187
2018-05-13 00:29:40 +00:00
Craig Topper
38ad7ddabc [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time.
llvm-svn: 332186
2018-05-12 23:14:39 +00:00
Chandler Carruth
095d69507e [x86] Remove a comment obviated by r330269. Should have deleted the
comment in the same revision but missed it.

Thanks to Dimitry Andric for catching this!

llvm-svn: 332177
2018-05-12 21:28:53 +00:00