llvm-project

Author	SHA1	Message	Date
Zheng Chen	04377a81ae	[Powerpc] Set instruction count as the first priority of LSR. On PowerPC, set instruction count as the first priority of LSR by default. Add an option ppc-lsr-no-insns-cost to return to the default LSR cost model. Reviewed By: steven.zhang, jsji Differential Revision: https://reviews.llvm.org/D72683	2020-02-16 21:04:55 -05:00
Craig Topper	20c5968e09	[X86] Increase latency of port5 masked compares and kshift/kadd/kunpck instructions in SKX scheduler model Uops.info shows these as 4 cycle latency.	2020-02-16 16:59:37 -08:00
Craig Topper	272d35aef5	[X86] Separate floating point handling out of EmitCmp and emitFlagsForSetcc. Both of those functions only have a single caller starting at LowerSETCC. Just handle floating point directly in LowerSETCC. This removes the need to pass Chain and IsSignaling all the way down.	2020-02-16 10:51:05 -08:00
Craig Topper	d26f11108b	[X86] Split X86ISD::CMP into an integer and FP opcode.	2020-02-16 10:10:19 -08:00
Eric Astor	ee2c0f76d7	[ms] [llvm-ml] Add a draft MASM parser Summary: Many directives are unavailable, and support for others may be limited. This first draft has preliminary support for: - conditional directives (including errors), - data allocation (unsigned types up to 8 bytes, and ALIGN), - equates/variables (numeric and text), - and procedure directives (without parameters), as well as COMMENT, ECHO, INCLUDE, INCLUDELIB, PUBLIC, and EXTERN. Text variables (aka text macros) are expanded in-place wherever the identifier occurs. We deliberately ignore all ml.exe processor directives. Prominent features not yet supported: - structs - macros (both procedures and functions) - procedures (with specified parameters) - substitution & expansion operators Conditional directives are complicated by the fact that "ifdef rax" is a valid way to check if a file is being assembled for a 64-bit x86 processor; we add support for "ifdef <register>" in general, which requires adding a tryParseRegister method to all MCTargetAsmParsers. (Some targets require backtracking in the non-register case.) Reviewers: rnk, thakis Reviewed By: thakis Subscribers: kerbowa, merge_guards_bot, wuzish, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, mgorny, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72680	2020-02-16 12:30:46 -05:00
Nikita Popov	7c362b25d7	[IRBuilder] Fix unnecessary IRBuilder copies; NFC Fix a few cases where an IRBuilder is passed to a helper function by value, while a by reference pass was intended.	2020-02-16 17:57:18 +01:00
Simon Pilgrim	b85df2e185	[X86] combineX86ShuffleChain - add support for combining 512-bit shuffles to PALIGNR	2020-02-16 16:13:26 +00:00
Simon Pilgrim	c9c1c2b335	[X86] combineX86ShuffleChain - add support for combining 512-bit shuffles to bit shifts	2020-02-16 16:13:25 +00:00
Sanjay Patel	e48b536be6	[x86] form broadcast of scalar memop even with >1 use The unseen logic diff occurs because MayFoldLoad() is defined like this: static bool MayFoldLoad(SDValue Op) { return Op.hasOneUse() && ISD::isNormalLoad(Op.getNode()); } The test diffs here all seem ok to me on screen/paper, but it's hard to know if that will lead to universally better perf for all targets. For example, if a target implements broadcast from mem as multiple uops, we would have to weigh the potential reduction of instructions and register pressure vs. possible increase in number of uops. I don't know if we can make a truly informed decision on this at compile-time. The motivating case that I'm looking at in PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 ...resembles the diff in extract-concat.ll, but we're not going to change the larger example there without at least 1 other fix. Differential Revision: https://reviews.llvm.org/D74088	2020-02-16 10:32:56 -05:00
Fangrui Song	46788a21f9	[X86][AsmPrinter] PrintSymbolOperand: prefer to lower ELF MO_GlobalAddress to .Lfoo$local	2020-02-15 13:45:29 -08:00
Craig Topper	e5b3ae4b34	[X86] Merge two switches together to simplify some code. NFC	2020-02-15 12:55:51 -08:00
Craig Topper	c3c20c83f3	[X86] Fix typo in comment. NFC	2020-02-15 12:48:19 -08:00
Simon Pilgrim	34a054ce71	[X86] combineX86ShuffleChain - add support for combining to X86ISD::ROTLI Refactors matchShuffleAsBitRotate to allow use by both lowerShuffleAsBitRotate and matchUnaryPermuteShuffle.	2020-02-15 20:04:54 +00:00
Craig Topper	3f7649799b	[X86] Move combineIncDecVector logic from Select to PreprocessISelDAG. This allows it to work properly with masked inc/dec for avx512. Those would have a vselect as the root node so didn't get a chance to call combineIncDecVector. This also simplifies the logic because we don't have to manage the topological ordering.	2020-02-15 09:59:12 -08:00
Fangrui Song	549b436beb	[MC] De-capitalize MCStreamer::Emit{Bundle,Addrsig}* etc So far, all non-COFF-related Emit* functions have been de-capitalized.	2020-02-15 09:11:48 -08:00
Pavel Iliin	dc0b815989	[AArch64][FIX] Correct register live range during pseudo expansion. This commit fixes the broken tests after commit b6a9fe209992789be3ed95664d25196361cfad34 on the expensive check builder: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/2884	2020-02-15 12:16:56 +00:00
David Green	da147ef0a5	[AArch64] Fixup kill flags on BSL generation This hopefully fixes up the expensive checks bot.	2020-02-15 11:44:23 +00:00
Fangrui Song	774971030d	[MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex}	2020-02-14 23:08:40 -08:00
Fangrui Song	1dc16c752d	[MC] Add MCSection::NonUniqueID and delete one MCContext::getELFSection overload	2020-02-14 20:25:52 -08:00
Matt Arsenault	8d8d46b57a	AMDGPU/GlobalISel: Fix missing impdef of scc on boolean bit ops	2020-02-14 22:35:30 -05:00
Fangrui Song	6d2d589b06	[MC] De-capitalize another set of MCStreamer::Emit* functions Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc	2020-02-14 19:26:52 -08:00
Fangrui Song	a55daa1461	[MC] De-capitalize some MCStreamer::Emit* functions	2020-02-14 19:11:53 -08:00
Shiva Chen	1cae2f9d19	[RISCV] Correct the CallPreservedMask for the function call in an interrupt handler CallPreservedMask is used to describe the register liveness after a function call. The function call in an interrupt handler should use the same CallPreservedMask as normal functions. So that only callee save registers can live through the function call.	2020-02-15 09:14:04 +08:00
Matt Arsenault	65dbdc329f	AMDGPU: Don't preserve analyses with div64 IR expansion The dominator tree needs to be updated, but that isn't handled now.	2020-02-14 20:06:02 -05:00
Matt Arsenault	dc3e499dd4	AMDGPU/GlobalISel: Fix G_EXTRACT of 96-bit results This would assert on an unhandled size in getRegSplitParts.	2020-02-14 15:57:40 -08:00
Matt Arsenault	60fea2713d	AMDGPU/GlobalISel: Improve 16-bit bswap Match the new DAG behavior and use v_perm_b32 when available. Also does better on SI/CI by expanding 16-bit swaps. Also fix non-power-of-2 cases.	2020-02-14 15:57:39 -08:00
Stanislav Mekhanoshin	922197d664	[TBLGEN] Allow to override RC weight Differential Revision: https://reviews.llvm.org/D74509	2020-02-14 15:49:52 -08:00
Craig Topper	8dc659c131	[Hexagon] Add an explicit makeArrayRef to pacify gcc 5.5 The array seemed to have decayed to a pointer before the ArrayRef constructor got called so there was no size information available.	2020-02-14 13:51:39 -08:00
Austin Kerbow	07824e65bf	[AMDGPU] Always enable XNACK feature when support is explicitly requested Differential Revision: https://reviews.llvm.org/D74630	2020-02-14 11:58:58 -08:00
Matt Arsenault	9ec668606b	AMDGPU: Add option to disable CGP division expansion The division expansions in AMDGPUCodeGenPrepare can't be relied on for correctness, since they punt to later optimization and possibly legalization in some cases. We still need a way to be able to write tests for the legalizer versions of the expansion. This is mostly for GlobalISel, since the optimizations it is expecting aren't implemented. The interaction with the flag to expand 64-bit division in the IR is pretty confusing, but these flags have different purposes.	2020-02-14 11:37:07 -08:00
Matt Arsenault	34d9a16e54	AMDGPU: Add option to expand 64-bit integer division in IR I didn't realize we were already expanding 24/32-bit division here. Use the available IntegerDivision utilities. This uses loops, so produces significantly smaller code than the inline DAG expansion. This now requires width reductions of 64-bit divisions before introducing the expanded loops. This helps work around missing legalization in GlobalISel for division, which are the only remaining core instructions that didn't work at all. I think this is plausibly a better implementation than exists in the DAG, although turning it on by default misses out on the constant value optimizations and also needs benchmarking.	2020-02-14 11:16:08 -08:00
Craig Topper	391cc4dd41	[X86] Use ZERO_EXTEND instead of SIGN_EXTEND in the fast isel handling of convert_from_fp16.	2020-02-14 10:57:12 -08:00
Craig Topper	fc0c72b2df	[X86] Add AVX512 support to the fast isel code for Intrinsic::convert_from_fp16/convert_to_fp16.	2020-02-14 10:57:11 -08:00
Matt Arsenault	bfbfa18591	GlobalISel: Lower s64->s16 G_FPTRUNC This is more or less directly ported from the AMDGPU custom lowering for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES instead of creating shift/trunc to extract the two halves, and zexting an inverted compare instead of select_cc). This also does not include the fast math expansion in the DAG which converts to f32 and then to f16. I think that belongs in a pre-legalize combine instead.	2020-02-14 10:46:58 -08:00
Volkan Keles	187686a22f	[GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges Like COPY instructions explained in D70616, we don't check the constraints when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check if registers can be replaced, or a COPY instruction needs to be built. https://reviews.llvm.org/D70564	2020-02-14 10:45:58 -08:00
Brian Cain	bf3b86bc2f	[Hexagon] v67+ HVX register pairs should support either direction Assembler now permits pairs like 'v0:1', which are encoded differently from the odd-first pairs like 'v1:0'. The compiler will require more work to leverage these new register pairs.	2020-02-14 12:43:43 -06:00
Matt Arsenault	8c2c0b3637	AMDGPU: Improve i16/v2i16 bswap	2020-02-14 09:53:22 -08:00
Craig Topper	7badb38918	[X86] Fix copy/paste mistake in comment. NFC	2020-02-14 09:47:50 -08:00
Matt Arsenault	a257bde420	AMDGPU/GlobalISel: Handle G_BSWAP	2020-02-14 09:09:44 -08:00
Pavel Iliin	b6a9fe2099	[AArch64] Add BIT/BIF support. This patch adds generation of SIMD bitwise insert BIT/BIF instructions. In the absence of GCC-like functionality for optimal constraints satisfaction during register allocation, the bitwise insert and select patterns are matched by a pseudo bitwise select BSP instruction whose def is not tied. It is expanded later, after register allocation, with the def tied to BSL/BIT/BIF depending on the operand registers. This allows getting rid of redundant moves. Reviewers: t.p.northover, samparker, dmgreen Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D74147	2020-02-14 14:19:39 +00:00
Simon Pilgrim	2492075add	[X86][SSE] lowerShuffleAsBitRotate - lower vXi8 shuffles to ROTL on pre-SSSE3 targets Without PSHUFB we are better using ROTL (expanding to OR(SHL,SRL)) than using the generic v16i8 shuffle lowering - but if we can widen to v8i16 or more then the existing shuffles are still the better option. REAPPLIED: Original commit rG11c16e71598d was reverted at rGde1d90299b16 as it wasn't accounting for later lowering. This version emits ROTLI or the OR(VSHLI/VSRLI) directly to avoid the issue.	2020-02-14 11:55:18 +00:00
Kazushi (Jam) Marukawa	60431bd728	[VE] Support for PIC (global data and calls) Summary: Support for PIC with tests for global variables and function calls. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D74536	2020-02-14 09:50:02 +01:00
Sam Parker	fd01b2f4a6	[NFC][ARM] Convert some pointers to references.	2020-02-14 08:29:01 +00:00
Fangrui Song	bcd24b2d43	[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*	2020-02-13 22:08:55 -08:00
Liu, Chen3	ec89335c47	[X86] Fix the bug that _mm_mask_cvtsepi64_epi32 generates a result without zeroing the upper 64 bits. Differential Revision : https://reviews.llvm.org/D74552	2020-02-14 09:26:06 +08:00
Fangrui Song	1d49eb00d9	[AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction Similar to rL328848.	2020-02-13 17:06:24 -08:00
Thomas Lively	918e90559b	[WebAssembly] Make stack pointer args inhibit tail calls Summary: Also make return calls terminator instructions so epilogues are inserted before them rather than after them. Together, these changes make WebAssembly's tail call optimization more stack-safe. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73943	2020-02-13 16:43:53 -08:00
Fangrui Song	0bc77a0f0d	[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions Similar to rL328848.	2020-02-13 13:38:33 -08:00
Craig Topper	c2e8a421ac	[X86] Don't widen 128/256-bit strict compares with vXi1 result to 512-bits on KNL. If we widen the compare we might trigger a spurious exception from the garbage data. We have two choices here. Explicitly force the upper bits to zero. Or use a legacy VEX vcmpps/pd instruction and convert the XMM/YMM result to mask register. I've chosen to go with the second option. I'm not sure which is really best. In some cases we could get rid of the zeroing since the producing instruction probably already zeroed it. But we lose the ability to fold a load. So which is best is dependent on surrounding code. Differential Revision: https://reviews.llvm.org/D74522	2020-02-13 13:26:40 -08:00
Fangrui Song	0dce409cee	[AsmPrinter] De-capitalize Emit{Function,BasicBlock}* and Emit{Start,End}OfAsmFile	2020-02-13 13:22:49 -08:00
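
Commit 2492075add above describes expanding ROTL to OR(SHL,SRL) when lowering byte shuffles. As a rough illustration of that identity only (this is not the LLVM code; the helper name rotl16 is hypothetical), the scalar sketch below rotates a 16-bit lane with two shifts and an OR. Rotating by 8 swaps the two bytes of the lane, which is the kind of byte shuffle such a rotate can stand in for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not an LLVM function.
// Rotates a 16-bit lane left by Amt bits, written as OR(SHL, SRL).
constexpr std::uint16_t rotl16(std::uint16_t X, unsigned Amt) {
  Amt &= 15; // the rotation amount is taken modulo the lane width
  if (Amt == 0)
    return X; // avoid shifting right by the full lane width
  const std::uint32_t Wide = X; // widen so the left shift cannot overflow
  return static_cast<std::uint16_t>((Wide << Amt) | (Wide >> (16 - Amt)));
}

// Rotating a 16-bit lane by 8 bits swaps its two bytes.
static_assert(rotl16(0x1234, 8) == 0x3412, "byte swap via rotate");
```

The early return for a zero amount keeps the right shift strictly below the lane width; a vector lowering would handle that case in its own way.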

56095 Commits