llvm-project

Author	SHA1	Message	Date
Craig Topper	d9fa8147c4	[X86] Autogenerate complete checks and fix a failure introduced in r337875. llvm-svn: 337889	2018-07-25 05:22:13 +00:00
Chandler Carruth	7024921c0a	[x86/SLH] Teach the x86 speculative load hardening pass to harden against v1.2 BCBS attacks directly. Attacks using spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker controlled address. Up until now, this could be mitigated by enabling retpolines. However, that is a relatively expensive technique to mitigate this particular flavor, especially because in most cases SLH will have already mitigated this. To fully mitigate this with SLH, we need to do two core things: 1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened. 2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself. The reason we need to do these two things is because hardening calls and jumps from this particular variant is importantly different from hardening against leak of secret data. Because the "bad" data here isn't a secret, but in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on, we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction. Users of SLH are currently paying 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. 
However, it is worth reminding folks that this does not mitigate all of the things retpolines do -- most notably, variant #2 is not in any way mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefully designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled. Differential Revision: https://reviews.llvm.org/D49663 llvm-svn: 337878	2018-07-25 01:51:29 +00:00
Craig Topper	fc501a9223	[X86] Use a shift plus an lea for multiplying by a constant that is a power of 2 plus 2/4/8. The LEA allows us to combine an add and the multiply by 2/4/8 together so we just need a shift for the larger power of 2. llvm-svn: 337875	2018-07-25 01:15:38 +00:00
Craig Topper	5be253d988	[X86] Expand mul by pow2 + 2 using a shift and two adds similar to what we do for pow2 - 2. llvm-svn: 337874	2018-07-25 01:15:35 +00:00
Craig Topper	56c104f104	[X86] Use a two lea sequence for multiply by 37, 41, and 73. These fit a pattern used by 11, 21, and 19. llvm-svn: 337871	2018-07-24 23:44:17 +00:00
Craig Topper	b5342b592e	[X86] Add test cases for multiply by 37, 41, and 73. These can all be handled with 2 LEAs similar to what we do for 11, 19, 21. llvm-svn: 337870	2018-07-24 23:44:15 +00:00
Craig Topper	f8fcee70a3	[X86] Change multiply by 26 to use two multiplies by 5 and an add instead of multiply by 3 and 9 and a subtract. Same number of operations, but ending in an add is friendlier due to it being commutable. llvm-svn: 337869	2018-07-24 23:44:12 +00:00
Craig Topper	5ddc0a2b14	[X86] When expanding a multiply by a negative of one less than a power of 2, like 31, don't generate a negate of a subtract that we'll never optimize. We generated a subtract for the power of 2 minus one then negated the result. The negate can be optimized away by swapping the subtract operands, but DAG combine doesn't know how to do that and we don't add any of the new nodes to the worklist anyway. This patch makes us explicitly emit the swapped subtract. llvm-svn: 337858	2018-07-24 21:31:21 +00:00
Craig Topper	6d29891bef	[X86] Generalize the multiply by 30 lowering to generic multiply by power of 2 minus 2. Use a left shift and 2 subtracts like we do for 30. Move this out from behind the slow lea check since it doesn't even use an LEA. Use this for multiply by 14 as well. llvm-svn: 337856	2018-07-24 21:15:41 +00:00
Heejin Ahn	8daef0751d	[WebAssembly] Add tests for weaker memory consistency orderings Summary: Currently all wasm atomic memory access instructions are sequentially consistent, so even if LLVM IR specifies weaker orderings than that, we should upgrade them to sequential ordering and treat them in the same way. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49194 llvm-svn: 337854	2018-07-24 21:06:44 +00:00
Craig Topper	86d6320b94	[X86] Change multiply by 19 to use (9 * X) * 2 + X instead of (5 * X) * 4 - 1. The new lowering can be done in 2 LEAs. The old code took 1 LEA, 1 shift, and 1 sub. llvm-svn: 337851	2018-07-24 20:31:48 +00:00
Craig Topper	1296c622df	[X86] Add test case to show failure to combine away negates that may be created by mul by constant expansion. Mul by constant can expand to a sequence that ends with a negate. If the next instruction is an add or sub we might be able to fold the negate away. We currently fail to do this because we explicitly don't add anything to the DAG combine worklist when we expand multiplies. This is primarily to keep the multiply from being reformed, but we should consider adding the users to the worklist. llvm-svn: 337843	2018-07-24 18:36:46 +00:00
Simon Atanasyan	28ded4ee19	[mips] Fix local dynamic TLS with Sym64 For the final DTPREL addition, rather than a lui/daddiu/daddu triple, LLVM was erroneously emitting a daddiu/daddiu pair, treating the %dtprel_hi as if it were a %dtprel_lo, since Mips::Hi expands unshifted for Sym64. Instead, use a new TlsHi node and, although unnecessary due to the exact structure of the nodes emitted, use TlsHi for local exec too to prevent future bugs. Also garbage-collect the unused TprelLo and TlsGd nodes, and TprelHi since its functionality is provided by the new common TlsHi node. Patch by James Clarke. Differential revision: https://reviews.llvm.org/D49259 llvm-svn: 337827	2018-07-24 13:47:52 +00:00
Sam Parker	8b93e82c3d	[ARM] Disable ARMCodeGenPrepare by default ARM Stage 2 builders have been suspiciously broken since the pass was committed. Disabling to hopefully fix the bots and give me time to debug. llvm-svn: 337821	2018-07-24 12:04:23 +00:00
Chandler Carruth	a25aca21af	[x86] Clean up and convert test to use generated CHECK lines. This test was already checking microscopic behavior of tail call under specific conditions. This just makes the CHECK lines much more consistent, clear, and easily updated when intentional changes are made. I've also switched the test to consistently name the entry block and to order the helper declarations and comments for specific tests in the more usual locations. llvm-svn: 337806	2018-07-24 03:18:08 +00:00
Chandler Carruth	d41dca2ddc	[x86] Update the CHECK lines of this test to use the latest patterns from the script. This minimizes the diff in subsequent changes. llvm-svn: 337805	2018-07-24 03:07:07 +00:00
Tom Stellard	b7f19e6d1e	AMDGPU/GlobalISel: Legalize G_INSERT Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49601 llvm-svn: 337798	2018-07-24 02:19:20 +00:00
Thomas Anderson	8e8a652c2f	Fix typo in test/CodeGen/Mips/dins.ll Differential Revision: https://reviews.llvm.org/D49704 llvm-svn: 337771	2018-07-23 23:19:53 +00:00
Martin Storsjo	100fc97051	[COFF] Fix assembly output of comdat sections without an attached symbol Since SVN r335286, the .xdata sections are produced without an attached symbol, which requires using a different syntax when printing assembly output. Instead of the usual syntax of '.section <name>,"dr",discard,<symbol>', use '.section <name>,"dr"' + '.linkonce discard' (which is what GCC uses for all assembly output). This fixes PR38254. Differential Revision: https://reviews.llvm.org/D49651 llvm-svn: 337756	2018-07-23 22:15:19 +00:00
Martin Storsjo	c2b701408e	[AArch64] Use MCAsmInfoMicrosoft and MCAsmInfoGNUCOFF as base classes This matches the structure used on X86 and ARM. This requires a little bit of duplication of the parts that are equal in both AArch64 COFF variants though. Before SVN r335286, these classes didn't add anything that MCAsmInfoCOFF didn't, but now they do. This makes AArch64 match X86 in how comdat is used for float constants for MinGW. Differential Revision: https://reviews.llvm.org/D49637 llvm-svn: 337755	2018-07-23 22:15:14 +00:00
Reid Kleckner	980c4df037	Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code models" Don't try to generate large PIC code for non-ELF targets. Neither COFF nor MachO have relocations for large position independent code, and users have been using "large PIC" code models to JIT 64-bit code for a while now. With this change, if they are generating ELF code, their JITed code will truly be PIC, but if they target MachO or COFF, it will contain 64-bit immediates that directly reference external symbols. For a JIT, that's perfectly fine. llvm-svn: 337740	2018-07-23 21:14:35 +00:00
Nirav Dave	5af81d5bfa	Add inline asm aliasing test. llvm-svn: 337734	2018-07-23 20:19:10 +00:00
Krzysztof Parzyszek	9500a24fce	[Hexagon] Handle unnamed globals in HexagonConstExpr Instead of comparing names, compare positions in the parent module. llvm-svn: 337723	2018-07-23 18:30:17 +00:00
Simon Atanasyan	307e5b31ce	[mips] Add more checks to the tls.ll test case. NFC llvm-svn: 337705	2018-07-23 16:05:44 +00:00
Cameron McInally	2c9bcffc92	[FPEnv] Legalize double width StrictFP vector operations Differential Revision: https://reviews.llvm.org/D48809 llvm-svn: 337698	2018-07-23 14:40:17 +00:00
Sam Parker	3828c6ff94	[ARM] ARMCodeGenPrepare backend pass Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges. The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp. Differential Revision: https://reviews.llvm.org/D48832 llvm-svn: 337687	2018-07-23 12:27:47 +00:00
Chandler Carruth	1d926fb9f4	[x86/SLH] Fix a bug where we would harden tail calls twice -- once as a call, and then again as a return. Also added a comment to try and explain better why we would be doing what we're doing when hardening the (non-call) returns. llvm-svn: 337673	2018-07-23 07:56:15 +00:00
Chandler Carruth	b66f2d8df8	[x86/SLH] Add a test covering indirect forms of control flow. NFC. This specifically covers different ways of making indirect calls and jumps. There are some bugs in SLH that I will be fixing in subsequent patches where the diff in the generated instructions makes the bug fix much more clear, so just checking in a baseline of this test to start. I'm also going to be adding direct mitigation for variant 1.2 which this file very specifically tests in the various forms it can arise on x86. Again, the diff to the generated instructions should make the change for that much more clear, so having the test as a baseline seems useful. llvm-svn: 337672	2018-07-23 07:51:51 +00:00
Craig Topper	b2a626b52e	[X86] Remove the max vector width restriction from combineLoopMAddPattern and rely on splitOpsAndApply to handle splitting. This seems to be a net improvement. There's still an issue under avx512f where we have a 512-bit vpaddd, but not vpmaddwd so we end up doing two 256-bit vpmaddwds and inserting the results before a 512-bit vpaddd. It might be better to do two 512-bit paddds with zeros in the upper half. Same number of instructions, but breaks a dependency. llvm-svn: 337656	2018-07-22 19:44:35 +00:00
Craig Topper	d8f80e90ce	[X86] Add more MADD recurrence test cases with larger and narrower vector widths. llvm-svn: 337650	2018-07-22 05:16:47 +00:00
Simon Atanasyan	ecd1e0afdd	[mips] Move out the WrapperPat declaration from the NotInMicroMips predicate This is a follow-up to rL335185. That commit adds some WrapperPat patterns for the microMIPS target. But the declaration of the WrapperPat class is under the NotInMicroMips predicate, and microMIPS patterns cannot be selected because the predicate (Subtarget->inMicroMipsMode()) && (!Subtarget->inMicroMipsMode()) is always false. This change moves the WrapperPat class declaration out of the NotInMicroMips predicate and enables microMIPS WrapperPat patterns. Differential revision: https://reviews.llvm.org/D49533 llvm-svn: 337646	2018-07-21 16:16:03 +00:00
Krzysztof Parzyszek	05337bdb50	[Hexagon] Disable packets in test to avoid ordering issues in checks llvm-svn: 337624	2018-07-20 21:55:55 +00:00
Roman Tereshin	31d52847ef	Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size" This reapplies commit r337489 reverted by r337541 Additionally, this commit contains a speculative fix to the issue reported in r337541 (the report does not contain an actionable reproducer, just a stack trace) llvm-svn: 337606	2018-07-20 20:10:04 +00:00
Craig Topper	28ac623f6f	[X86] Remove isel patterns for MOVSS/MOVSD ISD opcodes with integer types. Ideally our ISD node types going into the isel table would have types consistent with their instruction domain. This prevents us having to duplicate patterns with different types for the same instruction. Unfortunately, it seems our shuffle combining is currently relying on this a little to remove some bitcasts. This seems to enable some switching between shufps and shufd. Hopefully there's some way we can address this in the combining. Differential Revision: https://reviews.llvm.org/D49280 llvm-svn: 337590	2018-07-20 17:57:53 +00:00
Simon Pilgrim	70fcd0f481	[X86][XOP] Fix SUB constant folding for VPSHA/VPSHL shift lowering We can safely use getConstant here as we're still lowering, which allows constant folding to kick in and simplify the vector shift codegen. Noticed while working on D49562. llvm-svn: 337578	2018-07-20 16:55:18 +00:00
Evandro Menezes	fffa9b5897	[ARM] Add new feature to enable optimizing the VFP registers Enable the optimization of operations on DPR and SPR via a feature instead of checking the target. Differential revision: https://reviews.llvm.org/D49463 llvm-svn: 337575	2018-07-20 16:49:28 +00:00
Simon Pilgrim	c7132031a2	[X86][SSE] Use SplitOpsAndApply to improve HADD/HSUB lowering Improve AVX1 256-bit vector HADD/HSUB matching by using SplitOpsAndApply to split into 128-bit instructions. llvm-svn: 337568	2018-07-20 16:20:45 +00:00
Simon Pilgrim	a85b86a982	[X86][AVX] Add support for i16 256-bit vector horizontal op redundant shuffle removal llvm-svn: 337566	2018-07-20 15:51:01 +00:00
Simon Pilgrim	a2bc2d488c	[X86][AVX] Add v16i16 horizontal op redundant shuffle tests llvm-svn: 337565	2018-07-20 15:41:15 +00:00
Nirav Dave	25802ac9fd	[DAG] Avoid Node Update assertion due to AND simplification Check for construction-time folding for incomplete AND nodes in BackwardsPropagateMask. Fixes PR38185. Reviewers: RKSimon, samparker Reviewed By: samparker Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D49444 llvm-svn: 337563	2018-07-20 15:27:24 +00:00
Simon Pilgrim	7c56bce996	[X86][AVX] Add support for 32/64 bits 256-bit vector horizontal op redundant shuffle removal llvm-svn: 337561	2018-07-20 15:24:12 +00:00
Nirav Dave	5a4e11ad9c	[DAG] Fix Memory ordering check in ReduceLoadOpStore. When merging through a TokenFactor we need to check that the load may be ordered such that no other aliasing memory operations may happen. It is not sufficient to just check that the load is a member of the chain token factor as there may be an indirect chain. Require the load's chain has only one use. This fixes PR37826. Reviewers: spatel, davide, efriedma, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D49388 llvm-svn: 337560	2018-07-20 15:20:50 +00:00
Simon Pilgrim	8342126c4e	[X86][AVX] Add 256-bit vector horizontal op redundant shuffle tests llvm-svn: 337558	2018-07-20 15:07:53 +00:00
Simon Pilgrim	f907e19b5e	Regenerate partial vector fold test. NFCI. llvm-svn: 337551	2018-07-20 13:58:57 +00:00
Simon Pilgrim	cbf5af12b0	Regenerate remainder test. llvm-svn: 337546	2018-07-20 13:14:29 +00:00
Ulrich Weigand	9dd23b8433	[SystemZ] Test case formatting fixes Fix systematically wrong whitespace from a prior automated change. NFC. llvm-svn: 337542	2018-07-20 12:12:10 +00:00
Sam McCall	57743883f1	Revert "[LSV] Refactoring + supporting bitcasts to a type of different size" This reverts commit r337489. It causes asserts to fire in some TensorFlow tests, e.g. tensorflow/compiler/tests/gather_test.py on GPU. Example stack trace: Start test case: GatherTest.testHigherRank assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits" @ 0x5559446ebe10 __assert_fail @ 0x55593ef32f5e llvm::APInt::trunc() @ 0x55593d78f86e (anonymous namespace)::Vectorizer::lookThroughComplexAddresses() @ 0x55593d78f2bc (anonymous namespace)::Vectorizer::areConsecutivePointers() @ 0x55593d78d128 (anonymous namespace)::Vectorizer::isConsecutiveAccess() @ 0x55593d78c926 (anonymous namespace)::Vectorizer::vectorizeInstructions() @ 0x55593d78c221 (anonymous namespace)::Vectorizer::vectorizeChains() @ 0x55593d78b948 (anonymous namespace)::Vectorizer::run() @ 0x55593d78b725 (anonymous namespace)::LoadStoreVectorizer::runOnFunction() @ 0x55593edf4b17 llvm::FPPassManager::runOnFunction() @ 0x55593edf4e55 llvm::FPPassManager::runOnModule() @ 0x55593edf563c (anonymous namespace)::MPPassManager::runOnModule() @ 0x55593edf5137 llvm::legacy::PassManagerImpl::run() @ 0x55593edf5b71 llvm::legacy::PassManager::run() @ 0x55593ced250d xla::gpu::IrDumpingPassManager::run() @ 0x55593ced5033 xla::gpu::(anonymous namespace)::EmitModuleToPTX() @ 0x55593ced40ba xla::gpu::(anonymous namespace)::CompileModuleToPtx() @ 0x55593ced33d0 xla::gpu::CompileToPtx() @ 0x55593b26b2a2 xla::gpu::NVPTXCompiler::RunBackend() @ 0x55593b21f973 xla::Service::BuildExecutable() @ 0x555938f44e64 xla::LocalService::CompileExecutable() @ 0x555938f30a85 xla::LocalClient::Compile() @ 0x555938de3c29 tensorflow::XlaCompilationCache::BuildExecutable() @ 0x555938de4e9e tensorflow::XlaCompilationCache::CompileImpl() @ 0x555938de3da5 tensorflow::XlaCompilationCache::Compile() @ 0x555938c5d962 
tensorflow::XlaLocalLaunchBase::Compute() @ 0x555938c68151 tensorflow::XlaDevice::Compute() @ 0x55593f389e1f tensorflow::(anonymous namespace)::ExecutorState::Process() @ 0x55593f38a625 tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()() * SIGABRT received by PID 7798 (TID 7837) from PID 7798; * llvm-svn: 337541	2018-07-20 12:03:00 +00:00
Jonas Paulsson	c88d3f6a99	[SystemZ] Reimplement SchedModel IssueWidth and WriteRes/ReadAdvance mappings. As a consequence of recent discussions (http://lists.llvm.org/pipermail/llvm-dev/2018-May/123164.html), this patch changes the SystemZ SchedModels so that the IssueWidth is 6, which is the decoder capacity, and NumMicroOps become the number of decoder slots needed per instruction. In addition, the SchedWrite latencies now match the MachineInstructions def-operand indexes, and ReadAdvances have been added on instructions with one register operand and one memory operand. Review: Ulrich Weigand https://reviews.llvm.org/D47008 llvm-svn: 337538	2018-07-20 09:40:43 +00:00
Matt Arsenault	4bec7d4261	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering" Reverts r337079 with fix for msan error. llvm-svn: 337535	2018-07-20 09:05:08 +00:00
Stephen Canon	34f5867310	Add x86_64-unknown triple to llc for x86 test. llvm-svn: 337523	2018-07-20 03:50:55 +00:00
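
Several of the [X86] entries in this log describe multiply-by-constant expansions in words only (19, 26, 37, 41, 73, power of 2 minus 2, power of 2 plus 2/4/8, and the negated power of 2 minus 1). The sketch below checks the arithmetic identities those messages state. It is an illustration only, not LLVM code: the `lea` helper is a simplified model of an x86 LEA (base + index * scale with scale in 1, 2, 4, 8), all function names are hypothetical, and the 32-bit mask is an assumption made to imitate i32 wraparound. The instruction sequences LLVM actually emits may differ.

```python
MASK = 0xFFFFFFFF  # assumption: emulate 32-bit (i32) wraparound


def lea(base, index, scale):
    """Simplified model of x86 LEA: base + index * scale, scale in {1, 2, 4, 8}."""
    assert scale in (1, 2, 4, 8)
    return (base + index * scale) & MASK


def mul19(x):
    t = lea(x, x, 8)        # 9 * x
    return lea(x, t, 2)     # x + 2 * (9 * x) = 19 * x


def mul37(x):
    t = lea(x, x, 8)        # 9 * x
    return lea(x, t, 4)     # x + 4 * (9 * x) = 37 * x


def mul41(x):
    t = lea(x, x, 4)        # 5 * x
    return lea(x, t, 8)     # x + 8 * (5 * x) = 41 * x


def mul73(x):
    t = lea(x, x, 8)        # 9 * x
    return lea(x, t, 8)     # x + 8 * (9 * x) = 73 * x


def mul26(x):
    t = lea(x, x, 4)        # 5 * x
    u = lea(t, t, 4)        # 5 * (5 * x) = 25 * x
    return (u + x) & MASK   # ends in a commutable add


def mul_pow2_minus_2(x, k):
    # One left shift and two subtracts, e.g. k=5 gives 30, k=4 gives 14.
    return ((x << k) - x - x) & MASK


def mul_pow2_plus_2(x, k):
    # One left shift and two adds, e.g. k=5 gives 34.
    return ((x << k) + x + x) & MASK


def mul_pow2_plus_scale(x, k, scale):
    # One shift for the power of 2, then one LEA folds in x * 2/4/8 and the add.
    return lea((x << k) & MASK, x, scale)


def mul_neg_pow2_minus_1(x, k):
    # -(2**k - 1) * x written as the swapped subtract x - (x << k), no negate.
    return (x - (x << k)) & MASK
```

For example, `mul_pow2_plus_scale(x, 5, 8)` models a multiply by 40 (32 + 8), and `mul_neg_pow2_minus_1(x, 5)` models a multiply by -31; each result can be compared against `(c * x) & MASK` for the corresponding constant `c`.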


25241 Commits