llvm-project

Author	SHA1	Message	Date
Nirav Dave	925b64be64	[X86] Correctly use SSE registers if no-x87 is selected. Fix use of SSE1 registers for f32 ops in no-x87 mode. Notably, allow use of SSE instructions for f32 operations in 64-bit mode (but not 32-bit which is disallowed by calling convention). Also avoid translating memset/memcopy/memmove into SSE registers without X87 for 32-bit mode. This fixes PR38738. Reviewers: nickdesaulniers, craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D52555 llvm-svn: 343689	2018-10-03 14:13:30 +00:00
Alex Bradbury	efceb59801	[RISCV] Remove RV64 test lines from umulo-128-legalisation-lowering.ll The generated code is incorrect anyway, and this test adds noise to the upcoming set of patches that flesh out RV64 support. llvm-svn: 343675	2018-10-03 10:59:42 +00:00
Tim Renouf	a37679d67b	[AMDGPU] Fix for negative offsets in buffer/tbuffer intrinsics Summary: The new buffer/tbuffer intrinsics handle an out-of-range immediate offset by moving/adding offset&-4096 to a vgpr, leaving an in-range immediate offset, with a chance of the move/add being CSEd for similar loads/stores. However it turns out that a negative offset in a vgpr is illegal, even if adding the immediate offset makes it legal again. Therefore, this commit disables the offset&-4096 thing if the offset is negative. Differential Revision: https://reviews.llvm.org/D52683 Change-Id: Ie02f0a74f240a138dc2a29d17cfbd9e350e4ed13 llvm-svn: 343672	2018-10-03 10:29:43 +00:00
Fangrui Song	3d76d36059	[AMDGPU] Rename pass "isel" to "amdgpu-isel" Summary: The AMDGPU target specific pass "isel" is a misleading name. Reviewers: tstellar, echristo, javed.absar, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D52759 llvm-svn: 343659	2018-10-03 03:38:22 +00:00
Daniel Sanders	bad3936109	[globalisel] Fix one more missing Verifier pass from gisel-commandline-option.ll llvm-svn: 343658	2018-10-03 02:52:54 +00:00
Matt Arsenault	635d479322	AMDGPU: Always run AMDGPUAlwaysInline Even if calls are enabled, it still needs to be run for forcing inline of functions that use LDS. llvm-svn: 343657	2018-10-03 02:47:25 +00:00
Daniel Sanders	34eac35a60	Add the missing new files from r343654 llvm-svn: 343655	2018-10-03 02:21:30 +00:00
Daniel Sanders	c973ad1878	Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 The previous commit failed portions of the test-suite on GreenDragon due to duplicate COPY instructions and iterator invalidation. Both issues have now been fixed. To assist with this, a helper (cloneVirtualRegister) has been added to MachineRegisterInfo that can be used to get another register that has the same type and class/bank as an existing one. llvm-svn: 343654	2018-10-03 02:12:17 +00:00
Thomas Lively	9075cd607d	[WebAssembly] any_true and all_true intrinsics and instructions Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52755 llvm-svn: 343649	2018-10-03 00:19:39 +00:00
Sam Clegg	b2486f118d	[WebAssembly] Stop generating helper functions in WebAssemblyLowerEmscriptenEHSjLj Previously we were creating weakly defined helper function in each translation unit: - setThrew - setTempRet0 Instead we now assume these will be provided at link time. In emscripten they are provided in compiler-rt: https://github.com/kripken/emscripten/pull/7203 Additionally we previously created three global variable which are also now required to exist at link time instead. - __THREW__ - _threwValue - __tempRet0 Differential Revision: https://reviews.llvm.org/D49208 llvm-svn: 343640	2018-10-02 22:12:15 +00:00
Daniel Sanders	f430d941e9	[globalisel] Attempt to fix llvm-clang-x86_64-expensive-checks-win The behaviour of this bot indicates that -verify-machineinstrs has been forced on and is therefore inserting the verifier on builds that don't expect it. Explicitly specify whether it's enabled or disabled for each test. llvm-svn: 343633	2018-10-02 20:51:27 +00:00
Matt Morehouse	4b1ec17fb0	Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions" This reverts r343520 due to breakage of HWASan tests on Android. llvm-svn: 343616	2018-10-02 18:35:44 +00:00
Krzysztof Parzyszek	528aff3372	[Hexagon] Fix extracting subvectors of non-HVX vNi1 Patch by Brendon Cahoon. llvm-svn: 343596	2018-10-02 15:05:43 +00:00
Roman Lebedev	ea2046bea9	[NFC][CodeGen][X86] fma.ll, lwp-intrinsics.ll: actually spell --check-prefixes correctly :/ llvm-svn: 343588	2018-10-02 13:34:50 +00:00
Roman Lebedev	5412be4b7a	[NFC][CodeGen][X86] lwp-intrinsics.ll: fix check prefixes llvm-svn: 343585	2018-10-02 13:11:08 +00:00
Roman Lebedev	8b253f0b54	[NFC][CodeGen][X86] fma.ll: fix check prefixes for -mcpu=bdver2 llvm-svn: 343584	2018-10-02 13:10:55 +00:00
Simon Pilgrim	ad23f270db	[X86] Standardize floating point assembly comments Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well. Differential Revision: https://reviews.llvm.org/D52702 llvm-svn: 343562	2018-10-02 09:08:51 +00:00
Matt Arsenault	ab41193312	AMDGPU: Expand atomicrmw nand in IR llvm-svn: 343559	2018-10-02 03:50:56 +00:00
Thomas Lively	6f77811a21	[WebAssembly] Restore slashes in SIMD conversion names Summary: Depends on D52372 and D52442. Reviewers: aheejin, dschuff, aardappel Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52512 llvm-svn: 343558	2018-10-02 01:52:21 +00:00
Fangrui Song	99d4f74d01	[AArch64][DAGCombiner]: change -stop-after=isel to instruction-select "isel" is registered by AMDGPU. The test will break if the AMDGPU target is not built. llvm-svn: 343553	2018-10-02 00:22:51 +00:00
Daniel Sanders	33f42f97af	Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 There's a strange assertion on two of the Green Dragon bots that goes away when this is reverted. The assertion is in RegBankAlloc and if it is this commit then -verify-machine-instrs should have caught it earlier in the pipeline. llvm-svn: 343546	2018-10-01 22:32:08 +00:00
Reid Kleckner	9ea2c01264	[codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL Summary: Before this change, LLVM would always describe locals on the stack as being relative to some specific register, RSP, ESP, EBP, ESI, etc. Variables in stack memory are pretty common, so there is a special S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to reduce the size of our debug info. On top of the size savings, there are cases on 32-bit x86 where local variables are addressed from ESP, but ESP changes across the function. Unlike in DWARF, there is no FPO data to describe the stack adjustments made to push arguments onto the stack and pop them off after the call, which makes it hard for the debugger to find the local variables in frames further up the stack. To handle this, CodeView has a special VFRAME register, which corresponds to the $T0 variable set by our FPO data in 32-bit. Offsets to local variables are instead relative to this value. This is part of PR38857. Reviewers: hans, zturner, javed.absar Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D52217 llvm-svn: 343543	2018-10-01 21:59:45 +00:00
Craig Topper	42cd8cd862	Recommit r343499 "[X86] Enable load folding in the test shrinking code" Original message: This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669 llvm-svn: 343540	2018-10-01 21:35:28 +00:00
Craig Topper	f06a57fc89	Recommit r343498 "[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated." This includes a fix to prevent i16 compares with i32/i64 ands from being shrunk if bit 15 of the and is set and the sign bit is used. Original commit message: Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive. It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused. There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch. llvm-svn: 343539	2018-10-01 21:35:26 +00:00
Stefan Pintilie	5d32a86f44	[PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads. Going from XForm Load to DSForm Load requires that the immediate be 4 byte aligned. If we are not aligned we must leave the load as LDX (XForm). This bug is causing a compile-time failure in the benchmark h264ref. Differential Revision: https://reviews.llvm.org/D51988 llvm-svn: 343525	2018-10-01 20:16:27 +00:00
Daniel Sanders	9659bfda5a	[globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 llvm-svn: 343521	2018-10-01 18:56:47 +00:00
Matthias Braun	3e081703c3	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343520	2018-10-01 18:56:39 +00:00
Craig Topper	1346b5b7cf	[X86] Add more test shrinking with truncate and sign bit usage tests. NFC llvm-svn: 343519	2018-10-01 18:52:19 +00:00
Craig Topper	e072934d28	Revert r343499 and r343498. X86 test improvements There's a subtle bug in the handling of truncate from i32/i64 to i32 without minsize. I'll be adding more test cases and trying to find a fix. llvm-svn: 343516	2018-10-01 18:40:44 +00:00
Krzysztof Parzyszek	6d569a2cc4	[Hexagon] Remove incorrect pattern for swiz The pattern had a couple of problems: - It was checking for loads of bytes in the reverse order to what it should have been looking for. - It would replace loads of bytes with a load of a word without making sure that the alignment was correct. Thanks to Eli Friedman for pointing it out. llvm-svn: 343514	2018-10-01 18:24:40 +00:00
Matthias Braun	7159daa68e	MIRParser: Check that instructions only reference DILocation metadata llvm-svn: 343505	2018-10-01 17:50:52 +00:00
Craig Topper	aa84e1bba2	[X86] Enable load folding in the test shrinking code This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669 Differential Revision: https://reviews.llvm.org/D52699 llvm-svn: 343499	2018-10-01 17:10:50 +00:00
Craig Topper	2b587ad071	[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive. It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused. There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch. Differential Revision: https://reviews.llvm.org/D52669 llvm-svn: 343498	2018-10-01 17:10:45 +00:00
Simon Pilgrim	e0d2019052	[X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343494	2018-10-01 16:31:30 +00:00
Matthias Braun	004fe6bf83	DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types This fixes a case of bad index calculation when merging mismatching vector types. This changes the existing code to just use the existing extract_{subvector|element} and a bitcast (instead of bitcast first and then newly created extract_xxx) so we don't need to adjust any indices in the first place. rdar://44584718 Differential Revision: https://reviews.llvm.org/D52681 llvm-svn: 343493	2018-10-01 16:25:50 +00:00
Sanjay Patel	5187efcfab	[x86] add tests for 256- and 512-bit vector types for scalar-to-vector transform; NFC llvm-svn: 343491	2018-10-01 16:17:18 +00:00
Simon Atanasyan	1ea206be73	[mips] Generate tests expectations using update_llc_test_checks. NFC Generate tests expectations using update_llc_test_checks and reduce number of "check prefixes" used in the tests. llvm-svn: 343485	2018-10-01 14:43:07 +00:00
Clement Courbet	a933fb237e	[X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB. Summary: While looking at PR35606, I found out that the scheduling info is incorrect. One can check that it's really a P5+P6 and not a 2*P56 with: echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=- (vandps executes on P5 only) Reviewers: craig.topper, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52541 llvm-svn: 343447	2018-10-01 08:37:48 +00:00
Carlos Alberto Enciso	81d8ef2196	[DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal. When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid, as the recorded register has been removed. It causes the debugger to display the wrong variable value. Differential Revision: https://reviews.llvm.org/D52614 llvm-svn: 343445	2018-10-01 08:14:44 +00:00
Clement Courbet	ce4caff0de	[CodeGen][NFC] Add tests for heterogeneous types in MergeConsecutiveStores Reviewers: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52643 llvm-svn: 343444	2018-10-01 07:16:22 +00:00
Craig Topper	67d9dbdbdd	[X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers. We can only copy between a k-register and a GR32/GR64 register. This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure. This probably isn't the best fix, and we should probably figure out how to handle this correctly. Fixes PR38803. llvm-svn: 343443	2018-10-01 07:08:41 +00:00
Simon Pilgrim	f21083870d	[X86] Fix scheduler class for BTmi instructions This wasn't treated as a folded load instruction llvm-svn: 343424	2018-09-30 20:19:16 +00:00
Bjorn Pettersson	c2fc53ac90	[PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF Summary: The lowering of PHI nodes used to detect if all inputs originated from IMPLICIT_DEF's. If so the PHI node was replaced by an IMPLICIT_DEF. Now we also consider undef uses when checking the inputs. So if all inputs are implicitly defined or undef we lower the PHI to an IMPLICIT_DEF. This makes PHIElimination::LowerPHINode more consistent as it checks both implicit and undef properties at later stages. Reviewers: MatzeB, tstellar Reviewed By: MatzeB Subscribers: jvesely, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D52558 llvm-svn: 343417	2018-09-30 17:26:58 +00:00
Bjorn Pettersson	4af7f57bdf	[PHIElimination] Update the regression test for PR16508 Summary: When PR16508 was solved (in rL185363) a regression test was added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll. I discovered that the test case no longer reproduced the scenario from PR16508. This problem could have been amended by adding an extra RUN line with "-O1" (or possibly "-O0"), but instead I added a mir-reproducer test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir to get a reproducer that is less sensitive to changes in earlier passes (including O-level). While being at it I also corrected a code comment in PHIElimination::EliminatePHINodes that has been incorrect since the related bugfix from rL185363. Reviewers: MatzeB, hfinkel Reviewed By: MatzeB Subscribers: nemanjai, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52553 llvm-svn: 343416	2018-09-30 17:23:21 +00:00
Roman Lebedev	0496477c5d	[NFC][CodeGen][X86][AArch64] Add 64-bit constant bit field extract pattern tests llvm-svn: 343404	2018-09-30 12:42:08 +00:00
Simon Pilgrim	84e280ae42	[X86] Regenerate MMX coalescing test Exposes another extractelement(bitcast(scalartovector())) pattern llvm-svn: 343403	2018-09-30 09:42:04 +00:00
Craig Topper	1709829fed	[X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're compiling for a CPU with single uop BEXTR Summary: This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction. The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL it's one port 0/6 uop and one port 1/5 uop, despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution wise than the shift+and which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop. For now I've limited this transform to AMD CPUs which have a single uop BEXTR. It also might make sense if we can fold a load or if the and immediate is larger than 32-bits and can't be encoded as a sign extended 32-bit value or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at it doesn't look like load folding or large immediates were occurring so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case. Reviewers: RKSimon, spatel, lebedev.ri, andreadb Reviewed By: RKSimon Subscribers: llvm-commits, andreadb, RKSimon Differential Revision: https://reviews.llvm.org/D52570 llvm-svn: 343399	2018-09-30 03:01:46 +00:00
David Bolvansky	09fd8172df	[DAGCombiner][NFC] Tests for X div/rem Y single bit fold llvm-svn: 343392	2018-09-29 21:00:37 +00:00
Simon Pilgrim	c4e7c347cd	[X86][AVX2] Cleanup shuffle combining tests - add common prefixes llvm-svn: 343391	2018-09-29 20:34:16 +00:00
Simon Pilgrim	a2efe82b81	[X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target shuffles before simplifying inputs By removing demanded target shuffles that simplify to zero/undef/identity before simplifying its inputs we improve chances of further simplification, as only the immediate parent user of the combined is added back to the work list - this still doesn't help us if its passed through other ops though (bitcasts....). llvm-svn: 343390	2018-09-29 18:15:26 +00:00

26040 Commits