llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	adda256a7d	[ARM] Regenerate rotation tests llvm-svn: 367214	2019-07-29 09:48:07 +00:00
Simon Pilgrim	251b546f1b	[AMDGPU] Regenerate v2i16 insertelement tests. To help show the diffs from an upcoming SimplifyDemandedBits patch. llvm-svn: 367213	2019-07-29 09:47:07 +00:00
David Stuttard	20235ef3e7	[AMDGPU] Enable v4f16 and above for v_pk_fma instructions Summary: If isel is presented with <2 x half> vectors then it will correctly select v_pk_fma style instructions. If isel is presented with e.g. <4 x half> vectors it will scalarize, unlike for other instruction types (such as fadd, fmul etc.) Added extra support to enable this. Updated one of the tests to include a test for this (as well as extending the test to GFX9) Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65325 Change-Id: I50a4577a3f8223fb53992af3b7d26121f65b71ee llvm-svn: 367206	2019-07-29 08:15:10 +00:00
George Rimar	aef03e86c1	[obj2yaml] - Report an error when unable to resolve a sh_link reference properly. Because of a bug we did not report an error in the case shown in the test. With this patch we do. Differential revision: https://reviews.llvm.org/D65214 llvm-svn: 367203	2019-07-29 07:58:29 +00:00
George Rimar	99f73ebe5c	[llvm-objcopy] - Reimplement strip-dwo-groups.test to stop using the precompiled object. When llvm-objcopy removes .dwo sections, the index of the symbol table, the indices of the symbols, and the indices of the sections that follow the removed ones change. That affects SHT_GROUP sections, which need to be updated. Initially this test used a precompiled object; I rewrote it to use YAML and improved it a bit. Differential revision: https://reviews.llvm.org/D65273 llvm-svn: 367202	2019-07-29 07:55:39 +00:00
Craig Topper	eb1beabad9	[X86] Don't use PMADDWD for vector add reductions of multiplies if the mul inputs have an additional user. The pmaddwd inserts a truncate, if that truncate would end up creating additional instructions instead of making a zext narrower, then we shouldn't do it. I've restricted this to only sse4.1 targets since on prior targets the zext will be done in stages. So the truncate will probably not create additional instructions. Might need some more investigation of mul shrinking and the other pmaddwd transform to be sure this is the right decision. There might be a slight regression on AVX1 targets due to add splitting. Hard to say for sure. Maybe we need to look into using the vector reduction flag to use 2 narrow loads and a blend instead of extracting and inserting. llvm-svn: 367198	2019-07-29 01:36:58 +00:00
Craig Topper	ac9d0f4150	[X86] Add test cases to show missing one use check in combineLoopMAddPattern. llvm-svn: 367197	2019-07-29 01:36:54 +00:00
Roman Lebedev	6ff633ddc4	[NFC][InstCombine] Revisit tests in shift-amount-reassociation-with-truncation-shl.ll llvm-svn: 367196	2019-07-28 21:31:58 +00:00
Craig Topper	894916cac9	[X86] In combineLoopMAddPattern and combineLoopSADPattern, preserve the vector reduction flag on the final add. Handle unrolled loops by letting DAG combine revisit. This reverts r340478 and r340631 and replaces them with a simpler method of just letting DAG combine revisit the nodes to handle the other operand. llvm-svn: 367195	2019-07-28 18:45:42 +00:00
Sanjay Patel	99c57c6daf	[InstCombine] fold fsub+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194	2019-07-28 17:10:06 +00:00
David Green	b8b8b46a51	[ARM] MVE VPNOT This adds the patterns required to transform xor P0, -1 to a VPNOT. The instruction operands have to change a little for this, adding an in and an out VCCR reg and using a custom DecodeMVEVPNOT for the decode. Differential Revision: https://reviews.llvm.org/D65133 llvm-svn: 367192	2019-07-28 14:07:48 +00:00
David Green	9cf344e739	[ARM] Better patterns for fp <> predicate vectors These are some better patterns for converting between predicates and floating points. Much like the extends, we select "1"/"-1" or "0" depending on the predicate value. Or we perform a compare against 0 to convert to a predicate. Differential Revision: https://reviews.llvm.org/D65103 llvm-svn: 367191	2019-07-28 13:53:39 +00:00
Roman Lebedev	d5bc4b09f1	[NFC][InstCombine] Shift amount reassociation: can have trunc between shl's https://rise4fun.com/Alive/OQbM Not so simple for lshr/ashr, so those maybe later. https://bugs.llvm.org/show_bug.cgi?id=42391 llvm-svn: 367189	2019-07-28 13:13:46 +00:00
Hideto Ueno	e7bea9b73a	[Attributor] Deduce "align" attribute Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187	2019-07-28 07:04:01 +00:00
Hideto Ueno	cc0a4cdc89	[FunctionAttrs] Annotate "willreturn" for intrinsics Summary: In D62801, a new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site (a more precise definition is in LangRef). In this patch, willreturn is annotated for LLVM intrinsics. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64904 llvm-svn: 367184	2019-07-28 06:09:56 +00:00
Joerg Sonnenberger	791951bd32	Stricter check for the memory access. The current pattern would trigger for scheduling changes of the post-load computation, since those are commutable with the inline asm. Avoid this by explicitly checking the order of load vs asm block. llvm-svn: 367180	2019-07-27 18:57:59 +00:00
Simon Pilgrim	37a32f3c96	Regenerate UXTB tests llvm-svn: 367179	2019-07-27 18:44:15 +00:00
Simon Pilgrim	062cd8bb1d	[AMDGPU] Regenerate tests. To help show the diffs from an upcoming SimplifyDemandedBits patch. llvm-svn: 367175	2019-07-27 14:32:23 +00:00
Simon Pilgrim	603f94aa2a	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support (Reapplied) This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value. Fixes PR42777 llvm-svn: 367174	2019-07-27 14:11:59 +00:00
Sanjay Patel	02b9e45a7e	[InstSimplify] remove quadratic time looping (PR42771) The test case from: https://bugs.llvm.org/show_bug.cgi?id=42771 ...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is seemingly done just to avoid invalidating the instruction iterator. We can instead delay instruction deletion until we reach the end of the block (or we could delay until we reach the end of all blocks). There's a test diff here for a degenerate case with llvm.assume that is not meaningful in itself, but serves to verify this change in logic. This change probably doesn't result in much overall compile-time improvement because we call '-instsimplify' as a standalone pass only once in the standard -O2 opt pipeline currently. Differential Revision: https://reviews.llvm.org/D65336 llvm-svn: 367173	2019-07-27 14:05:51 +00:00
Simon Pilgrim	8a52671782	[SelectionDAG] Check for any recursion depth greater than or equal to limit instead of just equal to the limit. If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit. This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it. This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others....) - I'll see what I can do in future patches. llvm-svn: 367171	2019-07-27 12:48:46 +00:00
Simon Atanasyan	6faac434ed	[mips] Add (dis)assembler tests for beqzl and bnezl instructions. NFC llvm-svn: 367168	2019-07-27 08:13:27 +00:00
Amara Emerson	7bc4fad0fb	[AArch64][GlobalISel] Implement narrowing of G_SEXT. We need this to narrow a sext to s128. Differential Revision: https://reviews.llvm.org/D65357 llvm-svn: 367164	2019-07-26 23:46:38 +00:00
Jessica Paquette	aa8b9993c2	[AArch64][GlobalISel] Select @llvm.aarch64.stlxr for 32-bit pointers Add partial instruction selection for intrinsics like this: ``` declare i32 @llvm.aarch64.stlxr(i64, i32*) ``` (This only handles the case where a G_ZEXT is feeding the intrinsic.) Also make sure that the added store instruction actually has the memory op from the original G_STORE. Update select-stlxr-intrin.mir and arm64-ldxr-stxr.ll. Differential Revision: https://reviews.llvm.org/D65355 llvm-svn: 367163	2019-07-26 23:28:53 +00:00
Sanjay Patel	d20a0fe203	[InstCombine] add tests for fsub with negated operand; NFC llvm-svn: 367156	2019-07-26 21:12:22 +00:00
Wei Mi	55a68a2400	[JumpThreading] Stop searching predecessor when the current bb is in an unreachable loop. updatePredecessorProfileMetadata in jumpthreading tries to find the first dominating predecessor block for a PHI value by searching up the predecessor block chain. But jumpthreading may see some temporary IR state which contains unreachable bbs that have not been cleaned up. If an unreachable loop happens to be on the predecessor block chain, continuing to chase the predecessor block will run into an infinite loop. The patch fixes it. Differential Revision: https://reviews.llvm.org/D65310 llvm-svn: 367154	2019-07-26 20:59:22 +00:00
Sanjay Patel	a9ab31558c	[InstCombine] canonicalize negated operand of fdiv This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146	2019-07-26 19:56:59 +00:00
Sanjay Patel	487e957775	[InstCombine] add tests for fdiv with negated operand; NFC llvm-svn: 367145	2019-07-26 19:44:53 +00:00
Bob Haarman	6baac18a76	add 'a' to chmod in llvm-lipo executability tests Summary: When specifying symbolic permissions with + or -, if none of a/u/g/o are specified, bits set in the umask are not affected. This caused the llvm-lipo executability tests to fail on some systems, e.g. having an umask of 027 would cause chmod -x to not clear the executable bit for others. This change instead uses chmod a-x, which clears all the executable bits regardless of umask. Reviewers: smeenai, hans, anushabasana Reviewed By: smeenai Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65342 llvm-svn: 367142	2019-07-26 18:44:06 +00:00
Vlad Tsyrklevich	485b8789de	Revert "[X86][SSE] Replace PMULDQ GetDemandedBits combine with SimplifyMultipleUseDemandedBits handler." This reverts r367100, it appears to be causing test failures after Nico's revert of r367091. llvm-svn: 367141	2019-07-26 18:14:21 +00:00
Sean Fertile	9df6177d38	[PowerPC][AIX]Add lowering of MCSymbol MachineOperand. Adds machine operand lowering for MCSymbolSDNodes to the PowerPC backend. This is needed to produce call instructions in assembly for AIX because the callee operand is a MCSymbolSDNode. The test is XFAIL'ed for asserts due to a (valid) assertion in PEI that the AIX ABI isn't supported yet. Differential Revision: https://reviews.llvm.org/D63738 llvm-svn: 367133	2019-07-26 17:25:27 +00:00
Sergey Dmitriev	cdeaac5dce	[llvm-objcopy] Add support for --add-section for COFF This patch enables support for --add-section=... option for COFF objects. Differential Revision: https://reviews.llvm.org/D65040 llvm-svn: 367130	2019-07-26 17:06:41 +00:00
Cullen Rhodes	2cde8b5db6	[AArch64][SVE2] Rename bitperm feature to sve2-bitperm Summary: The bitperm feature flag is now prefixed with SVE2, as it is for all other SVE2 extensions Patch by Maciej Gabka. Reviewers: sdesmalen, rovka, chill, SjoerdMeijer, rengolin Reviewed By: SjoerdMeijer, rengolin Differential Revision: https://reviews.llvm.org/D65327 llvm-svn: 367124	2019-07-26 15:57:50 +00:00
Nico Weber	13f337c4cb	Revert r367091, it caused PR42777. llvm-svn: 367118	2019-07-26 14:58:42 +00:00
Petar Avramovic	cf21794566	[MIPS GlobalISel] Fix check for void return during lowerCall Void return used to have unsigned with value 0 for virtual register but with addition of Register class and changes to arguments to lowerCall this is no longer valid. Check for void return by inspecting the Ty field in OrigRet. Differential Revision: https://reviews.llvm.org/D65321 llvm-svn: 367107	2019-07-26 13:19:37 +00:00
Petar Avramovic	b1fc6f6130	[MIPS GlobalISel] Select inttoptr and ptrtoint Select G_INTTOPTR and G_PTRTOINT for MIPS32. Differential Revision: https://reviews.llvm.org/D65217 llvm-svn: 367104	2019-07-26 13:08:06 +00:00
Sanjay Patel	c229cfeb7a	[InstCombine] remove flop from lerp patterns (Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101	2019-07-26 11:19:18 +00:00
Simon Pilgrim	d93e8ece7b	[X86][SSE] Replace PMULDQ GetDemandedBits combine with SimplifyMultipleUseDemandedBits handler. This removes a GetDemandedBits user and allows us to benefit from the DemandedElts propagated through SimplifyDemandedBits. llvm-svn: 367100	2019-07-26 11:10:20 +00:00
Carl Ritson	00e89b428b	[AMDGPU] Add llvm.amdgcn.softwqm intrinsic Add llvm.amdgcn.softwqm intrinsic which behaves like llvm.amdgcn.wqm only if there is other WQM computation in the shader. Reviewers: nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64935 llvm-svn: 367097	2019-07-26 09:54:12 +00:00
Simon Pilgrim	9758407bf1	[TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG support. llvm-svn: 367096	2019-07-26 09:41:08 +00:00
Simon Pilgrim	cb5f7de448	[ARM][ParallelDSP] Regenerate multi-use-loads.ll test checks llvm-svn: 367094	2019-07-26 09:32:21 +00:00
Momchil Velikov	898d953693	[AArch64] Define ETE and TRBE system registers Embedded Trace Extension and Trace Buffer Extension are optional future architecture extensions. (cf. https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools) Their system registers are documented here: https://developer.arm.com/docs/ddi0601/a ETE shares register names with ETM. One exception is the ETE TRCEXTINSELR0 register, which has the same encoding as the ETM TRCEXTINSELR register (but different semantics). This patch treats them as aliases: the assembler will accept both names, emitting identical encoding, and the disassembler will keep disassembling to TRCEXTINSELR. Differential Revision: https://reviews.llvm.org/D63707 llvm-svn: 367093	2019-07-26 09:19:08 +00:00
Simon Pilgrim	b32ceb79b0	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support. This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. llvm-svn: 367091	2019-07-26 08:38:39 +00:00
Sam Parker	c760b5da11	[ARM][LowOverheadLoops] Add CPSR defs Both WhileLoopStart and LoopEnd may get turned into a cmp and br pair, so add an implicit def to these pseudo instructions in case that WLS and LE aren't generated. Differential Revision: https://reviews.llvm.org/D65275 llvm-svn: 367089	2019-07-26 08:15:01 +00:00
Pengfei Wang	9ad565f70e	[WinEH] Allocate space in funclets stack to save XMM CSRs Summary: This is an alternate approach to D57970. Currently funclets reuse the same stack slots that are used in the parent function for saving callee-saved xmm registers. If the parent function modifies a callee-saved xmm register before an exception is thrown, the catch handler will overwrite the original saved value. This patch allocates space in funclets stack for saving callee-saved xmm registers and uses RSP instead of RBP to access memory. Reviewers: andrew.w.kaylor, LuoYuanke, annita.zhang, craig.topper, RKSimon Subscribers: rnk, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63396 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 367088	2019-07-26 07:33:15 +00:00
Kang Zhang	4e794a8bae	Some case error for: detected memory leaks llvm-svn: 367083	2019-07-26 03:25:58 +00:00
Matt Arsenault	a9ea8a9aae	AMDGPU/GlobalISel: Handle most function return types handleAssignments gives up pretty easily on structs, and i8 values for some reason. The other case that doesn't work is when an implicit sret needs to be inserted if the return size exceeds the number of return registers. llvm-svn: 367082	2019-07-26 02:36:05 +00:00
Matt Arsenault	51d795d941	GlobalISel: Fold out unmerge to scalars from concat_vector Removes illegal intermediate vectors if an operation was lowering to concat_vectors, and the next operation is scalarized. llvm-svn: 367081	2019-07-26 02:22:23 +00:00
Kang Zhang	5c61015455	[PowerPC] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: The `block-placement` pass creates some patterns with unconditional branches for which we can do the simple early return. But the `early-ret` pass runs before `block-placement`, and we don't want to run it again. This patch does the simple early return to optimize the blocks at the end of `block-placement`. Below is an example ``` BB: \| BB: XOR 3, 3, 4 \| XOR 3, 3, 4 B TBB \| B ChainBB ... \| ... ChainBB: \| ChainBB: B TBB \| ADD 3, 3, 4 ... \| BLR TBB: \| ADD 3, 3, 4 \| BLR \| ``` Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 367080	2019-07-26 01:58:53 +00:00
Francis Visoiu Mistrih	2d8fdcae96	Reland: [Remarks] Add support for serializing metadata for every remark streamer This allows every serializer format to implement metaSerializer() and return the corresponding meta serializer. Original llvm-svn: 366946 Reverted llvm-svn: 367004 This fixes the unit tests on Windows bots. llvm-svn: 367078	2019-07-26 01:33:30 +00:00
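
Note on c229cfeb7a above: the lerp rewrite rests on the identity (Y * (1.0 - Z)) + (X * Z) = Y + Z * (X - Y). The following is only a sketch of that algebra, not code from the patch; the function names are hypothetical, and exact rationals are used so that rounding does not cloud the comparison. In floating point the two forms can round differently.

```python
from fractions import Fraction

# Hypothetical helper names, for illustration only (not from the LLVM source).

def lerp_original(x, y, z):
    """Form before the fold: two multiplies, one subtract, one add."""
    return y * (1 - z) + x * z

def lerp_factored(x, y, z):
    """Form after the fold: one multiply, one subtract, one add."""
    return y + z * (x - y)

# Compare both forms over a small grid of exact rational inputs.
samples = [Fraction(n, 7) for n in range(-3, 4)]
all_equal = all(
    lerp_original(x, y, z) == lerp_factored(x, y, z)
    for x in samples
    for y in samples
    for z in samples
)
```

The factored form drops one multiply, which matches the commit's remark that factoring eliminates an instruction.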

63620 Commits