51587 Commits

Author SHA1 Message Date
Craig Topper
55b0d987fd [X86] Use int64_t and isInt<N> instead of APInt operations in foldLoadStoreIntoMemOperand. NFC
We know all our values are limited to 64 bits here so we don't need an APInt.

This should save some generated code checking between large and small size.

llvm-svn: 358338
2019-04-13 18:57:41 +00:00
Heejin Ahn
5f3a04510a [WebAssembly] Use Function::hasOptSize() (NFC)
Summary: Use member function.

Reviewers: aheejin

Subscribers: sunfish, hiraditya, sbc100, jgravelle-google, dschuff, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60651

Patch by Hideto Ueno (uenoku)

llvm-svn: 358336
2019-04-13 16:54:39 +00:00
Amara Emerson
93e58d2396 [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash.
This enables the simple copy combine that already exists in the CombinerHelper.
However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a
set of MIs to process, and so would end up causing a crash when deleted MIs were
being added to the combiner worklist again.

Differential Revision: https://reviews.llvm.org/D60579

llvm-svn: 358318
2019-04-13 00:33:25 +00:00
Amara Emerson
2806fd01a1 [AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an undef mask element.
If a shufflevector's mask vector has an element with "undef" then the generic
instruction defining that element register is a G_IMPLICT_DEF instead of G_CONSTANT.
This fixes the selector to handle this case, and for now assumes that undef just means
zero. In future we'll optimize this case properly.

llvm-svn: 358312
2019-04-12 21:31:21 +00:00
Thomas Lively
9e27514996 [WebAssembly] Add mutable-globals to bleeding-edge CPU
Summary: This brings the backend in line with Clang.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60594

llvm-svn: 358310
2019-04-12 20:39:53 +00:00
Brendon Cahoon
4df216cd62 [Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass
The Hexagon Vector Loop Carried Reuse pass was allowing reuse between
two shufflevectors with different masks. The reason is that the masks
are not instruction objects, so the code that checks each operand
just skipped over the operands.

This patch fixes the bug by checking if the operands are the same
when they are not instruction objects. If the objects are not the
same, then the code assumes that reuse cannot occur.

Differential Revision: https://reviews.llvm.org/D60019

llvm-svn: 358292
2019-04-12 16:37:12 +00:00
Simon Pilgrim
6c8f4ada36 [X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns
Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions.

This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result.

This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation.

Differential Revision: https://reviews.llvm.org/D60610

llvm-svn: 358286
2019-04-12 14:22:57 +00:00
Kang Zhang
2446f843ae [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options need pass-name as the parameters.
But if we use the pass-name ppc-early-ret, we will get below error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
Below pass-names have the pass is not registered error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248

llvm-svn: 358271
2019-04-12 09:59:40 +00:00
Eric Christopher
6b06c6a5ef Add explicit dependencies on MCSection.h and MCDwarf.h to the .cpp
files rather than rely on transitive includes from MCStreamer.h.

llvm-svn: 358263
2019-04-12 07:40:01 +00:00
Eric Christopher
b6926bdcff Revert "[PowerPC] Add initialization for some ppc passes"
This reverts commit 6f8f98ce8de7c0e4ebd7fa2e1fd9507fe8d1c317 as it
is breaking nearly every bot.

llvm-svn: 358260
2019-04-12 07:16:58 +00:00
Kang Zhang
6f8f98ce8d [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options need pass-name as the parameters.
But if we use the pass-name ppc-early-ret, we will get below error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
Below pass-names have the pass is not registered error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248

llvm-svn: 358256
2019-04-12 06:35:15 +00:00
Eric Christopher
b6c190da23 Include what's used in a few cpp files - these were getting transitive
includes from MCDwarf.h.

llvm-svn: 358254
2019-04-12 06:16:33 +00:00
Zi Xuan Wu
ac79ef8f0e [PowerPC] More precise exploitation of P9 maddld instruction when operands are constant
There are 3 operands of maddld, (add (mul %1, %2), %3) and sometimes
they are constant. If there is constant operand, it takes extra li to 
materialize the operand, and one more extra register too. So it's not 
profitable to use maddld to optimize mul-add pattern.

Differential Revision: https://reviews.llvm.org/D60181

llvm-svn: 358253
2019-04-12 05:21:31 +00:00
Nick Desaulniers
8ec304c9fd [X86AsmPrinter] refactor static functions into private methods. NFC
Summary:
A lot of the code for printing special cases of operands in this
translation unit are static functions. While I too have suffered many
years of abuse at the hands of C, we should prefer private methods,
particularly when you start passing around *this as your first argument,
which is a code smell.

This will help make generic vs arch specific asm printing easier, as it
brings X86AsmPrinter more in line with other arch's derived AsmPrinters.
We will then be able to more easily move architecture generic code to
the base class, and architecture specific code to the derived classes.

Some other small refactorings while we're here:
- the parameter Op is now consistently OpNo
- add spaces around binary expressions. I know we're not millionaires
  but c'mon.

Reviewers: echristo

Reviewed By: echristo

Subscribers: smeenai, hiraditya, llvm-commits, srhines, craig.topper

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60577

llvm-svn: 358236
2019-04-11 22:47:13 +00:00
Amara Emerson
7e9355f870 [AArch64][GlobalISel] Flesh out vector load/store support for more types.
Some of these were legalizing into smaller vector types unnecessarily,
others were simply not supported yet.

llvm-svn: 358223
2019-04-11 20:40:01 +00:00
Amara Emerson
b956051415 [AArch64][GlobalISel] Legalization and ISel support for load/stores of vectors of pointers.
Loads and store of values with type like <2 x p0> currently don't get imported
because SelectionDAG has no knowledge of pointer types. To leverage the existing
support for vector load/stores, we can bitcast the value to have s64 element
types instead. We do this as a custom legalization.

This patch also adds support for general loads of <2 x s64>, and relaxes some
type conditions on selecting G_BITCAST.

Differential Revision: https://reviews.llvm.org/D60534

llvm-svn: 358221
2019-04-11 20:32:24 +00:00
Craig Topper
68a5d619a4 [X86] Restrict vselect handling in scalarizeExtEltFP to only case to pre type legalization where the setcc result type is vXi1.
If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1.

The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand.

llvm-svn: 358218
2019-04-11 19:57:44 +00:00
Craig Topper
586fad50ac [X86] Add patterns for using movss/movsd for atomic load/store of f32/64. Remove atomic fadd pseudos use isel patterns instead.
This patch adds patterns for turning bitcasted atomic load/store into movss/sd.

It also removes the pseudo instructions for atomic RMW fadd. Instead just adding isel patterns for folding an atomic load into addss/sd. And relying on the new movss/sd store pattern to handle the write part.

This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled.

Differential Revision: https://reviews.llvm.org/D60394

llvm-svn: 358215
2019-04-11 19:19:52 +00:00
Craig Topper
f7e548c076 Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2"
With correct test checks this time.

If we have X87, but not SSE2 we can atomicaly load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integ

This matches what gcc and icc do for this case and removes an existing FIXME.

llvm-svn: 358214
2019-04-11 19:19:42 +00:00
Craig Topper
8200880c9a Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2"
I seem to have messed up the test checks.

llvm-svn: 358212
2019-04-11 19:04:38 +00:00
Craig Topper
1c2dfc3100 [X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2
If we have X87, but not SSE2 we can atomicaly load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness.

This matches what gcc and icc do for this case and removes an existing FIXME.

Differential Revision: https://reviews.llvm.org/D60156

llvm-svn: 358211
2019-04-11 18:40:21 +00:00
Simon Pilgrim
40b647ae8e [X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV3 mask support
Completes SimplifyDemandedVectorElts's basic variable shuffle mask support which should help D60512 + D60562 

llvm-svn: 358186
2019-04-11 15:29:15 +00:00
Roger Ferrer Ibanez
b621f04135 [RISCV] Diagnose invalid second input register operand when using %tprel_add
RISCVMCCodeEmitter::expandAddTPRel asserts that the second operand must be
x4/tp. As we are not currently checking this in the RISCVAsmParser, the assert
is easy to trigger due to wrong assembly input.

This patch does a late check of this constraint.

An alternative could be using a singleton register class for x4/tp similar to
the current one for sp. Unfortunately it does not result in a good diagnostic.
Because add is an overloaded mnemonic, if no matching is possible, the
diagnostic of the first failing alternative seems to be used as the diagnostic
itself. This means that this case the %tprel_add is diagnosed as an invalid
operand (because the real add instruction only has 3 operands).

Differential Revision: https://reviews.llvm.org/D60528

llvm-svn: 358183
2019-04-11 15:13:12 +00:00
Luo, Yuanke
a2b4d3fab6 [X86] Add MM register mapping from CodeView to MC register id
Differential Revision: https://reviews.llvm.org/D60437

Change-Id: I2183a6d825d0284b22705d423b88882992b236c5
llvm-svn: 358179
2019-04-11 15:01:03 +00:00
Simon Pilgrim
8a25154fa7 [X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV mask support
llvm-svn: 358174
2019-04-11 14:35:45 +00:00
Diogo N. Sampaio
8ddfd46c61 [AArch64] Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64
Summary:  Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64

Reviewers: pbarrio, DavidSpickett, LukeGeeson

Reviewed By: LukeGeeson

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60259

llvm-svn: 358171
2019-04-11 14:19:43 +00:00
Simon Pilgrim
6f3866c6fb [X86] SimplifyDemandedVectorElts - add X86ISD::VPERMILPV mask support
llvm-svn: 358170
2019-04-11 14:15:01 +00:00
Simon Pilgrim
cb5218ad48 [X86] SimplifyDemandedVectorElts - add X86ISD::VPERMIL2 mask support
llvm-svn: 358167
2019-04-11 14:04:19 +00:00
Simon Pilgrim
e468cc7f14 [X86] SimplifyDemandedVectorElts - add VPPERM support
We need to add support for all variable shuffle mask ops, but VPPERM is the only one that already has test coverage.

llvm-svn: 358165
2019-04-11 13:30:38 +00:00
Oliver Stannard
6fa145e429 Test commit access
llvm-svn: 358162
2019-04-11 12:53:33 +00:00
Shiva Chen
7cc03bd064 [RISCV] Put data smaller than eight bytes to small data section
Because of gp = sdata_start_address + 0x800, gp with signed twelve-bit offset
could covert most of the small data section. Linker relaxation could transfer
the multiple data accessing instructions to a gp base with signed twelve-bit
offset instruction.

Differential Revision: https://reviews.llvm.org/D57493

llvm-svn: 358150
2019-04-11 04:59:13 +00:00
Amara Emerson
213e0bde04 [AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal.
The existing isel support already works for p0 once the legalizer accepts it.

llvm-svn: 358144
2019-04-10 23:06:14 +00:00
Amara Emerson
a7ff111b04 [AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD.
llvm-svn: 358143
2019-04-10 23:06:11 +00:00
Amara Emerson
ae878dab03 [AArch64][GlobalISel] Scalarize vector SDIV.
llvm-svn: 358142
2019-04-10 23:06:08 +00:00
Craig Topper
61f31cbcb2 [X86] Teach foldMaskedShiftToScaledMask to look through an any_extend from i32 to i64 between the and & shl
foldMaskedShiftToScaledMask tries to reorder and & shl to enable the shl to fold into an LEA. But if there is an any_extend between them it doesn't work.

This patch modifies the code to look through any_extend from i32 to i64 when the and mask only uses bits that weren't from the extended part.

This will prevent a regression from D60358 caused by 64-bit SHL being narrowed to 32-bits when their upper bits aren't demanded.

Differential Revision: https://reviews.llvm.org/D60532

llvm-svn: 358139
2019-04-10 21:42:08 +00:00
Craig Topper
4a32ce39b7 [X86] Make _Int instructions the preferred instructon for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX.
Many of our instructions have both a _Int form used by intrinsics and a form
used by other IR constructs. In the EVEX space the _Int versions usually cover
all the capabilities include broadcasting and rounding. While the other version
only covers simple register/register or register/load forms. For this reason
in EVEX, the non intrinsic form is usually marked isCodeGenOnly=1.

In the VEX encoding space we were less consistent, but usually the _Int version
was the isCodeGenOnly version.

This commit makes the VEX instructions match the EVEX instructions. This was
done by manually studying the AsmMatcher table so its possible I missed some
cases, but we should be closer now.

I'm thinking about using the isCodeGenOnly bit to simplify the EVEX2VEX
tablegen code that disambiguates the _Int and non _Int versions. Currently it
checks register class sizes and Record the memory operands come from. I have
some other changes I was looking into for D59266 that may break the memory check.

I had to make a few scheduler hacks to keep the _Int versions from being treated
differently than the non _Int version.

Differential Revision: https://reviews.llvm.org/D60441

llvm-svn: 358138
2019-04-10 21:29:41 +00:00
Craig Topper
87a8f9761e [X86] Replace some if statements in isel address matching that should never be true with asserts. And move them earlier before we looked through operands that don't change size. NFC
These ifs were ensuring we don't have to handle types larger than 64 bits probably because we use getZExtValue in several places below them.

None of the callers of this code pass types larger than 64-bits so we can just assert instead of branching in release code.

I've also moved them earlier since we're just looking through operations that don't effect bit width.

This is prep work for some refactoring I plan to do to the (and (shl)) handling code.

llvm-svn: 358123
2019-04-10 19:08:59 +00:00
Nick Desaulniers
ad8f3a1440 [X86AsmPrinter] refactor to limit use of Modifier. NFC
Summary:
The Modifier memory operands is used in 2 cases of memory references
(H & P ExtraCodes). Rather than pass around the likely nullptr Modifier,
refactor the handling of the Modifier out from printOperand().

The refactorings in this patch:
- Don't forward declare printOperand, move its definition up.
  - The diff makes it look like there's a change to printPCRelImm
    (narrator: there's not).
- Create printModifiedOperand()
  - Move logic for Modifier to there from printOperand
  - Use printModifiedOperand in 3 call sites that actually create
    Modifiers.
- Remove now unused Modifier parameter from printOperand
- Remove default parameter from printLeaMemReference as it only has 1
  call site that explicitly passes a parameter.
- Remove default parameter from printMemReference, make call lone call
  site explicitly pass nullptr.
- Drop Modifier parameter from printIntelMemReference, as Intel style
  memory references don't support the Modifiers in question.

This will allow future changes to printOperand() to make it a pure virtual
method on the base AsmPrinter class, allowing for more generic handling
of some architecture generic constraints. X86AsmPrinter was the only
derived class of AsmPrinter to have additional parameters on its
printOperand function.

Reviewers: craig.topper, echristo

Reviewed By: echristo

Subscribers: hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60526

llvm-svn: 358122
2019-04-10 19:01:44 +00:00
Roman Lebedev
dc67659ba5 [X86] X86ScheduleBdVer2: use !listsplat operator to cleanup loadres calculation
The problem is that one can't concatenate an empty list
(implied all-ones) with non-empty list here. The result
will be the non-empty list, and it won't match the length
of the ExePorts list.

The problems begin when LoadRes != 1 here,
which is the case in PdWriteResYMMPair,
and more importantly i think it will be the case for PdWriteResExPair.

llvm-svn: 358118
2019-04-10 18:26:42 +00:00
David Green
0861c87b06 Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg
Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not
seeing through to the constant in other blocks. Revert this patch while we come
up with a better way to handle that.

I will try to follow this up with some better tests.

llvm-svn: 358113
2019-04-10 18:00:41 +00:00
Craig Topper
35fe07916a [AArch64] Teach getTestBitOperand to look through ANY_EXTENDS
This patch teach getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression.

Differential Revision: https://reviews.llvm.org/D60482

llvm-svn: 358108
2019-04-10 17:27:29 +00:00
Nick Desaulniers
5277b3ff25 [AsmPrinter] refactor to remove remove AsmVariant. NFC
Summary:
The InlineAsm::AsmDialect is only required for X86; no architecture
makes use of it and as such it gets passed around between arch-specific
and general code while being unused for all architectures but X86.

Since the AsmDialect is queried from a MachineInstr, which we also pass
around, remove the additional AsmDialect parameter and query for it deep
in the X86AsmPrinter only when needed/as late as possible.

This refactor should help later planned refactors to AsmPrinter, as this
difference in the X86AsmPrinter makes it harder to make AsmPrinter more
generic.

Reviewers: craig.topper

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60488

llvm-svn: 358101
2019-04-10 16:38:43 +00:00
Simon Pilgrim
37d8d55823 [X86][AVX] getTargetConstantBitsFromNode - extract bits from X86ISD::SUBV_BROADCAST
llvm-svn: 358096
2019-04-10 16:24:47 +00:00
Diogo N. Sampaio
aae424a2d2 [AArch64] Add lowering pattern for scalar fp16 facge and facgt
Summary: The fp16 scalar version of facge and facgt requires a custom patter matching, as the result type is not the same width of the operands.

Reviewers: olista01, javed.absar, pbarrio

Reviewed By: javed.absar

Subscribers: kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60212

llvm-svn: 358083
2019-04-10 13:34:18 +00:00
Diogo N. Sampaio
651463e4a8 [ARM] [FIX] Add missing f16 vector operations lowering
Summary:
Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node.
As well, allows <8xhalf> and <4xhalf> vldup1 operations.

These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics.

Reviewers: olista01, pbarrio, LukeGeeson, efriedma

Reviewed By: efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60319

llvm-svn: 358081
2019-04-10 13:28:06 +00:00
Clement Courbet
48e2eb0b27 [NFC] Fix unused variable warning.
llvm-svn: 358080
2019-04-10 13:18:05 +00:00
Diana Picus
6bdade85de Fixup r358063
Fix warning/error about mixed signedness.

llvm-svn: 358065
2019-04-10 09:31:28 +00:00
Diana Picus
4a7f8d8d6b [ARM GlobalISel] Add some asserts. NFC.
Make sure some arm opcodes don't unintentionally sneak into thumb mode.

llvm-svn: 358064
2019-04-10 09:14:37 +00:00
Diana Picus
b6e83b98f9 [ARM GlobalISel] Select G_FCONSTANT for VFP3
Make it possible to TableGen code for FCONSTS and FCONSTD.

We need to make two changes to the TableGen descriptions of vfp_f32imm
and vfp_f64imm respectively:
* add GISelPredicateCode to check that the immediate fits in 8 bits;
* extract the SDNodeXForms into separate definitions and create a
GISDNodeXFormEquiv and a custom renderer function for each of them.

There's a lot of boilerplate to get the actual value of the immediate,
but it basically just boils down to calling ARM_AM::getFP32Imm or
ARM_AM::getFP64Imm.

llvm-svn: 358063
2019-04-10 09:14:32 +00:00
Diana Picus
3533ad6801 [ARM GlobalISel] Select G_FCONSTANT into pools
Put all floating point constants into constant pools and load their
values from there.

llvm-svn: 358062
2019-04-10 09:14:24 +00:00