llvm-project

Author	SHA1	Message	Date
Krzysztof Parzyszek	c09a14eeb2	[Hexagon] Generate correct runtime check when recognizing memmove The check (assuming positive stride) for validity of memmove should be (a) the destination is at a lower address than the source, or (b) the distance between the source and destination is greater than or equal to the number of bytes copied. For the second part it is sufficient to assume that the destination is at a higher address, since the opposite case is covered by (a). The distance calculation was previously done by subtracting the pointers in the wrong order. llvm-svn: 311650	2017-08-24 11:59:53 +00:00
Evgeny Astigeevich	540a39adf7	[ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes for the Thumb1 target. This causes generation of redundant code and affects performance. This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106 Differential Revision: https://reviews.llvm.org/D36467 llvm-svn: 311649	2017-08-24 10:00:25 +00:00
Sjoerd Meijer	afc2cd3c9e	[AArch64] Custom lowering of copysign f16 This is a follow up patch of r311154 and introduces custom lowering of copysign f16 to avoid promotions to single precision types when the subtarget supports fullfp16. Differential Revision: https://reviews.llvm.org/D36893 llvm-svn: 311646	2017-08-24 09:21:10 +00:00
Daniel Sanders	2c269f6bf8	Re-commit: [globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm Summary: This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction. The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel. Depends on D36085 The previous commit was reverted for breaking the build but this appears to have been the recurring problem on the Windows bots with tablegen not being re-run when llvm-tblgen is changed but the .td's aren't. If it re-occurs then forcing a build with clean=True should fix it but this string should do this in advance: Requires a clean build. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36086 llvm-svn: 311645	2017-08-24 09:11:20 +00:00
Matt Arsenault	d664315ae8	IPRA: Don't assume called function is first call operand Fixes not finding the called global for AMDGPU call pseudoinstructions, which prevented IPRA from doing much. llvm-svn: 311637	2017-08-24 07:55:15 +00:00
Chandler Carruth	dc2556934c	[x86] NFC: Clean up two tests and generate precise checks for them. Mostly this involved giving unnamed values names and running the IR through `opt` to re-format it but merging in any important comments in the original. I then deleted pointless comments and inlined the function attributes for ease of reading and editing. All of this is to make it much easier to see the instructions being generated here and evaluate any updates to the tests. llvm-svn: 311634	2017-08-24 07:38:36 +00:00
Igor Breger	47be5fbbe9	[GlobalISel][X86] Support G_IMPLICIT_DEF. Summary: Support G_IMPLICIT_DEF. Reviewers: zvi, guyblank, t.p.northover Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36733 llvm-svn: 311633	2017-08-24 07:06:27 +00:00
Wei Ding	a131d3fb29	Add 'llvm.experimental.constrained.fma' Intrinsic. Differential Revision: http://reviews.llvm.org/D36335 llvm-svn: 311629	2017-08-24 04:18:24 +00:00
Hans Wennborg	c39ec95d88	[DAG] Fix Node Replacement in PromoteIntBinOp When one operand is a user of another in a promoted binary operation we may replace and delete the returned value before returning triggering an assertion. Reorder node replacements to prevent this. Fixes PR34137. Landing on behalf of Nirav. Differential Revision: https://reviews.llvm.org/D36581 llvm-svn: 311623	2017-08-24 01:08:27 +00:00
Dylan McKay	4f5002198b	[AVR] Use the correct register classes for 16-bit atomic operations llvm-svn: 311620	2017-08-24 00:14:38 +00:00
Aditya Nandakumar	efd8a84cd5	[GISEl]: Translate phi into G_PHI G_PHI has the same semantics as PHI but also has types. This lets us verify that the types in the G_PHI are consistent. This also allows specifying legalization actions for G_PHIs. https://reviews.llvm.org/D36990 llvm-svn: 311596	2017-08-23 20:45:48 +00:00
Reid Kleckner	6d353348e5	Parse and print DIExpressions inline to ease IR and MIR testing Summary: Most DIExpressions are empty or very simple. When they are complex, they tend to be unique, so checking them inline is reasonable. This also avoids the need for CodeGen passes to append to the llvm.dbg.mir named md node. See also PR22780, for making DIExpression not be an MDNode. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37075 llvm-svn: 311594	2017-08-23 20:31:27 +00:00
Lei Huang	0cb591fc4c	Update branch coalescing to be a PowerPC specific pass Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Differential Revision: https://reviews.llvm.org/D32776 llvm-svn: 311588	2017-08-23 19:25:04 +00:00
Craig Topper	853a8d9ffc	[AVX512] Don't create SHRUNKBLEND SDNodes for 512-bit vectors There are no 512-bit blend instructions so we shouldn't create SHRUNKBLEND for them. On a side note, it looks like there may be a missed opportunity for constant folding TESTM when LHS and RHS are equal. This fixes PR34139. Differential Revision: https://reviews.llvm.org/D36992 llvm-svn: 311572	2017-08-23 16:41:02 +00:00
Victor Leschuk	3697ebe25f	Revert r311546 as it breaks build http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/4394 llvm-svn: 311560	2017-08-23 15:21:10 +00:00
Daniel Sanders	c3885c4589	[globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm Summary: This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction. The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel. Depends on D36085 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36086 llvm-svn: 311546	2017-08-23 12:14:18 +00:00
Florian Hahn	5b92960091	[ARM] Check for assembler instructions in test. Currently this test causes test failures on some machines, due to isel not being registered. Update the test to run all passes and check emitted assembly instructions for now. llvm-svn: 311545	2017-08-23 11:53:24 +00:00
Florian Hahn	214e13d949	[ARM] Add missing patterns for insert_subvector. Summary: In some cases, shufflevector instruction can be transformed involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns. Reviewers: t.p.northover, olista01, javed.absar, rengolin Reviewed By: javed.absar Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36796 llvm-svn: 311543	2017-08-23 10:20:59 +00:00
Hiroshi Inoue	cc555bd0ac	[PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate - recommitting after fixing a test failure on MacOS On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311538	2017-08-23 08:55:18 +00:00
Hiroshi Inoue	dbb285ca51	Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate This reverts commit rL311526 due to failures in some buildbot. llvm-svn: 311530	2017-08-23 06:38:05 +00:00
Hiroshi Inoue	c4449df1b0	[PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x | 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311526	2017-08-23 05:15:15 +00:00
Dean Michael Berris	0884b73220	[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text Summary: This change achieves two things: - Redefine the Custom Event handling instrumentation points emitted by the compiler to not require dynamic relocation of references to the __xray_CustomEvent trampoline. - Remove the synthetic reference we emit at the end of a function that we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER associated with the section where the function is defined. To achieve the custom event handling change, we've had to introduce the concept of sled versioning -- this will need to be supported by the runtime to allow us to understand how to turn on/off the new version of the custom event handling sleds. That change has to land first before we change the way we write the sleds. To remove the synthetic reference, we rely on a relatively new linker feature that preserves the sections that are associated with each other. This allows us to limit the effects on the .text section of ELF binaries. Because we're still using absolute references that are resolved at runtime for the instrumentation map (and function index) maps, we mark these sections write-able. In the future we can re-define the entries in the map to use relative relocations instead that can be statically determined by the linker. That change will be a bit more invasive so we defer this for later. Depends on D36816. Reviewers: dblaikie, echristo, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36615 llvm-svn: 311525	2017-08-23 04:49:41 +00:00
Yonghong Song	dc1dbf6ef3	bpf: add variants of -mcpu=# and support for additional jmp insns -mcpu=# will support: . generic: the default insn set . v1: insn set version 1, the same as generic . v2: insn set version 2, version 1 + additional jmp insns . probe: the compiler will probe the underlying kernel to decide proper version of insn set. We did not use -mcpu=native since llc/llvm will interpret -mcpu=native as the underlying hardware architecture regardless of -march value. Currently, only x86_64 supports -mcpu=probe. Other architectures will silently revert to "generic". Also added -mcpu=help to print available cpu parameters. llvm will print out the information only if there is at least one cpu and at least one feature. Add an unused dummy feature to enable the printout. Examples for usage: $ llc -march=bpf -mcpu=v1 -filetype=asm t.ll $ llc -march=bpf -mcpu=v2 -filetype=asm t.ll $ llc -march=bpf -mcpu=generic -filetype=asm t.ll $ llc -march=bpf -mcpu=probe -filetype=asm t.ll $ llc -march=bpf -mcpu=v3 -filetype=asm t.ll 'v3' is not a recognized processor for this target (ignoring processor) ... $ llc -march=bpf -mcpu=help -filetype=asm t.ll Available CPUs for this target: generic - Select the generic processor. probe - Select the probe processor. v1 - Select the v1 processor. v2 - Select the v2 processor. Available features for this target: dummy - unused feature. Use +feature to enable a feature, or -feature to disable it. For example, llc -mcpu=mycpu -mattr=+feature1,-feature2 ... Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 311522	2017-08-23 04:25:57 +00:00
Matthias Braun	d6c0868da5	Fix tail-merge-after-mbp test The output of this test changed after the fix in r311520 to have -run-pass=block-placement behave like it does in a normal pipeline. Adjust the test. llvm-svn: 311521	2017-08-23 03:49:53 +00:00
Matthias Braun	8426d1342d	Add test case for r311511 This also changes the TailDuplicator to be configured explicitly pre/post regalloc rather than relying on the isSSA() flag. This was necessary to have `llc -run-pass` work reliably. llvm-svn: 311520	2017-08-23 03:17:59 +00:00
Sanjay Patel	0ab50f6d68	[x86] auto-generate full checks; NFC I don't see anything Darwin-specific here, so I made the target generic x86-64. llvm-svn: 311465	2017-08-22 16:27:00 +00:00
Sanjay Patel	40b8e3bfe5	[x86] simplify runs and auto-generate full checks I've replaced the two OS-specific runs with a generic run because there's no functional difference in the resulting output that we're checking. Also, the script still doesn't work with a Win target. llvm-svn: 311463	2017-08-22 16:21:45 +00:00
Renato Golin	c070c73d5e	[ARM] Avoid creating duplicate ANDs in SelectionDAG When expanding a BRCOND into a BR_CC, do not create an AND 1 if one already exists. Review: D36705 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311447	2017-08-22 11:02:45 +00:00
Renato Golin	f63d701669	[ARM] Call setBooleanContents(ZeroOrOneBooleanContent) The ARM backend should call setBooleanContents so that it can use known bits to make some optimizations. Review: D35821 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311446	2017-08-22 11:02:37 +00:00
Craig Topper	b49f0893b2	[X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results. This patch just adds a flag to prevent the APInt from changing width. Fixes PR34271. Differential Revision: https://reviews.llvm.org/D36996 llvm-svn: 311429	2017-08-22 05:40:17 +00:00
Evandro Menezes	bc11ca1a31	[AArch64] Restore the test of conditional branch fusion Restore the functionality of this test that was broken by https://reviews.llvm.org/rL306144. Differential revision: https://reviews.llvm.org/D36807 llvm-svn: 311389	2017-08-21 21:57:43 +00:00
Tim Northover	ef1fc5ae89	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311388	2017-08-21 21:56:11 +00:00
Sean Fertile	00393cce3a	[PPC] Refine checks for emitting TOC restore nop and tail-call eligibility. For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call eligibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353	2017-08-21 17:35:32 +00:00
Craig Topper	8078dd2984	[X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match Summary: With masked operations, it's possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions. Reviewers: RKSimon, spatel, zvi, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36938 llvm-svn: 311342	2017-08-21 16:04:04 +00:00
Stefan Pintilie	9495f33e45	[PowerPC] Check if the pre-increment PHI Node already exists Preparations to use the pre-increment are sometimes done in the target independent pass Loop Strength Reduction. We try to detect them in the PowerPC specific pass so that they are not done twice and so that we do not add PHIs that are not required. Differential Revision: https://reviews.llvm.org/D36736 llvm-svn: 311332	2017-08-21 13:36:18 +00:00
Igor Breger	685889cf9b	[GlobalISel][X86] Support G_BRCOND operation. Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34754 llvm-svn: 311327	2017-08-21 10:51:54 +00:00
Igor Breger	1b5e3d3e28	[GlobalISel][X86] LowerCall, for now don't handle ByValue function arguments. llvm-svn: 311321	2017-08-21 08:59:59 +00:00
Michael Zuckerman	bdb6673151	[InterLeaved] Adding lit test for future work interleaved load stride 3 llvm-svn: 311320	2017-08-21 08:56:39 +00:00
Chandler Carruth	98c51cbee1	[x86] Teach the "generic" x86 CPU to avoid patterns that are slow on widely used processors. This occurred to me when I saw that we were generating 'inc' and 'dec' when for Haswell and newer we shouldn't. However, there were a few "X is slow" things that we should probably just set. I've avoided any of the "X is fast" features because most of those would be pretty serious regressions on processors where X isn't actually fast. The slow things are likely to be negligible costs on processors where these aren't slow and a significant win when they are slow. In retrospect this seems somewhat obvious. Not sure why we didn't do this a long time ago. Differential Revision: https://reviews.llvm.org/D36947 llvm-svn: 311318	2017-08-21 08:45:22 +00:00
Chandler Carruth	63dd5e0ef6	[x86] Handle more cases where we can re-use an atomic operation's flags rather than doing a separate comparison. This both saves an explicit comparison and avoids the use of `xadd` which introduces register constraints and other challenges to the generated code. The motivating case is from atomic reference counts where `1` is the sentinel rather than `0` for whatever reason. This can and should be lowered efficiently on x86 by just using a different flag, however the x86 code only handled the `0` case. There remain some further opportunities here that are currently hidden due to canonicalization. I've included test cases that show these and FIXMEs. However, I don't at the moment have any production use cases and they seem substantially harder to address. Differential Revision: https://reviews.llvm.org/D36945 llvm-svn: 311317	2017-08-21 08:45:19 +00:00
Sam Parker	b252ffd2cc	[ARM][AArch64] Cortex-A75 and Cortex-A55 support This patch introduces support for Cortex-A75 and Cortex-A55, Arm's latest big.LITTLE A-class cores. They implement the ARMv8.2-A architecture, including the cryptography and RAS extensions, plus the optional dot product extension. They also implement the RCpc AArch64 extension from ARMv8.3-A. Cortex-A75: https://developer.arm.com/products/processors/cortex-a/cortex-a75 Cortex-A55: https://developer.arm.com/products/processors/cortex-a/cortex-a55 Differential Revision: https://reviews.llvm.org/D36667 llvm-svn: 311316	2017-08-21 08:43:06 +00:00
Craig Topper	d6f4be97e6	[AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled. There's no functional difference between the AVX512DQ instructions if we're not masking. This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently. llvm-svn: 311308	2017-08-21 05:29:02 +00:00
Craig Topper	485cca1ecb	[AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table. llvm-svn: 311307	2017-08-21 05:03:28 +00:00
Craig Topper	d63b33f9c4	[AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node. We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load. llvm-svn: 311300	2017-08-20 19:47:00 +00:00
Igor Breger	88a3d5c855	[GlobalISel][X86] Support call ABI. Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported. Reviewers: zvi, guyblank, oren_ben_simhon Reviewed By: oren_ben_simhon Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34602 llvm-svn: 311279	2017-08-20 09:25:22 +00:00
Igor Breger	b3a860a5e8	[GlobalISel][X86] Support asymmetric copy from/to GPR physical register. Usually this case is generated by ABI lowering; it requires performing truncate/anyext. llvm-svn: 311278	2017-08-20 07:14:40 +00:00
Chandler Carruth	9ef881efab	[x86] Fix an even stranger corner case where we have multiple levels of cmov self-referencing. Pointed out by Amjad Aboud in code review, test case minorly simplified from the one he posted. llvm-svn: 311267	2017-08-19 23:35:50 +00:00
Craig Topper	a0319bb434	[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation. llvm-svn: 311263	2017-08-19 22:02:02 +00:00
Martin Storsjo	91522ffa12	[ARM] Check the right order for halves of VZIP/VUZP if both parts are used This is the exact same fix as in SVN r247254. In that commit, the fix was applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue is present for VZIP/VUZP as well. This fixes PR33921. Differential Revision: https://reviews.llvm.org/D36899 llvm-svn: 311258	2017-08-19 19:47:48 +00:00
Jatin Bhateja	6b4c205685	[DAGCombiner] Extending pattern detection for vector shuffle. Summary: If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Reviewers: zvi, delena, RKSimon, thakis Reviewed By: RKSimon Subscribers: chandlerc, eladcohen, llvm-commits Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 311255	2017-08-19 18:08:59 +00:00
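
The runtime check described in the Hexagon memmove commit (c09a14eeb2) above can be sketched as follows. This is only an illustration of conditions (a) and (b) from the commit message, assuming a positive stride and plain integer addresses; the helper name `forwardCopyIsMemmoveSafe` is hypothetical and is not the code the Hexagon pass actually emits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper illustrating the check from the commit message.
// It is not taken from the Hexagon backend.
bool forwardCopyIsMemmoveSafe(std::uintptr_t dst, std::uintptr_t src,
                              std::size_t numBytes) {
  // (a) The destination is at a lower address than the source.
  if (dst < src)
    return true;
  // (b) Here dst >= src, so dst - src cannot wrap around. Subtracting in
  //     the opposite order (src - dst) would wrap for unsigned addresses,
  //     which is the kind of ordering mistake the commit describes.
  return dst - src >= numBytes;
}
```

Because (a) is tested first, the subtraction in (b) only runs when the destination is not below the source, which matches the commit's remark that the opposite case is already covered by (a).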


21272 Commits