llvm-project

Author	SHA1	Message	Date
James Y Knight	646c4032e7	Fix two issues in MergeConsecutiveStores: 1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. llvm-svn: 251816	2015-11-02 18:48:08 +00:00
Jonas Paulsson	72640f1c9f	[MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves. Since the verifier will give false reports if it incorrectly thinks MI is loading or storing using an FI, it is necessary to scan memoperands and find out how the FI is used in the instruction. This should be relatively rare. Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag. Reviewed by Quentin Colombet. llvm-svn: 251620	2015-10-29 08:28:35 +00:00
Matthias Braun	f2f194455f	Revert "ScheduleDAGInstrs: Remove IsPostRA flag" It broke 3 arm testcases. This reverts commit r251608. llvm-svn: 251615	2015-10-29 05:06:41 +00:00
Matthias Braun	dc7580aa88	MachineScheduler: Fix typo in debug message Maybe I just missed the humor there ;-) llvm-svn: 251609	2015-10-29 03:57:28 +00:00
Matthias Braun	7ffadd0087	ScheduleDAGInstrs: Remove IsPostRA flag This was a layering violation in ScheduleDAGInstrs (and MachineSchedulerBase) they both shouldn't know directly whether they are used by the PostMachineScheduler or the MachineScheduler. llvm-svn: 251608	2015-10-29 03:57:24 +00:00
Matthias Braun	b0c437bc76	MachineScheduler: Use ranged for and slightly simplify the code llvm-svn: 251607	2015-10-29 03:57:17 +00:00
Tim Northover	2d4d161519	ARM: support .watchos_version_min and .tvos_version_min. These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min. llvm-svn: 251569	2015-10-28 22:36:05 +00:00
Sanjoy Das	1d1929aace	[ValueTracking] Use !range metadata more aggressively in KnownBits Summary: Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more aggressively. Reviewers: majnemer, nlewycky, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14100 llvm-svn: 251487	2015-10-28 03:20:15 +00:00
Sanjoy Das	4ff3cf6d92	[SelectionDAG] Don't inspect !range metadata for extended loads Summary: Don't call `computeKnownBitsFromRangeMetadata` for extended loads -- this can cause a mismatch between the width of the !range metadata and the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the future). This isn't a problem now, but will be after a future change. Note: this can be made more aggressive in the future. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14107 llvm-svn: 251486	2015-10-28 03:20:10 +00:00
James Y Knight	14eedd189b	Make the SelectionDAG graph printer use SDNode::PersistentId labels. r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer. Change to be consistent, for ease of cross-reference. llvm-svn: 251465	2015-10-27 23:09:03 +00:00
Sanjay Patel	bbd4c79c8f	Use the 'arcp' fast-math-flag when combining repeated FP divisors This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450	2015-10-27 20:27:25 +00:00
Cong Hou	07eeb8001e	Create a new interface addSuccessorWithoutWeight(MBB) in MBB to add successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't mean that optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429	2015-10-27 17:59:36 +00:00
Mehdi Amini	891c0973df	Do not use "else" when both branches return (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251398	2015-10-27 08:12:08 +00:00
Steve King	fee370be72	Fix llc crash processing S/UREM for -Oz builds caused by rL250825. When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373	2015-10-27 00:14:06 +00:00
Ivan Krasin	465fbe25c4	Fix indents. It's a follow up to r251353. llvm-svn: 251364	2015-10-26 22:35:40 +00:00
Ivan Krasin	298639a5fd	Move imported entities into DwarfCompilationUnit to speed up LTO linking. Summary: In particular, this CL speeds up the official Chrome linking with LTO by 1.8x. See more details in https://crbug.com/542426 Reviewers: dblaikie Subscribers: jevinskie Differential Revision: http://reviews.llvm.org/D13918 llvm-svn: 251353	2015-10-26 21:36:35 +00:00
David Blaikie	7b54b525cd	Remove assert(false) in favor of asserting the if conditional it is contained within. Also adjust the code to avoid 3 redundant map lookups. llvm-svn: 251327	2015-10-26 18:41:13 +00:00
Evgeniy Stepanov	d1aad26589	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers __builtin_thread_pointer as __aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324	2015-10-26 18:28:25 +00:00
Elena Demikhovsky	092858588a	Scalarizer for masked.gather and masked.scatter intrinsics. When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237	2015-10-25 15:37:55 +00:00
Michael Kuperstein	eaa16005af	[X86] Use correct calling convention for MCU psABI libcalls When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223	2015-10-25 08:14:05 +00:00
Rafael Espindola	84921b9860	Refactor: Simplify boolean conditional return statements in lib/CodeGen. Patch by Richard. llvm-svn: 251213	2015-10-24 23:11:13 +00:00
Simon Pilgrim	3448cbcc51	[DAGCombiner] Tidy up ConstantFP commutation. NFCI Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203	2015-10-24 20:06:18 +00:00
Simon Pilgrim	7430804fe1	[DAGCombiner] Generalize masking of constant rotates. We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197	2015-10-24 18:44:52 +00:00
Simon Pilgrim	d5ef318b5b	[X86][XOP] Add support for lowering vector rotations This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188	2015-10-24 13:17:26 +00:00
Joseph Tremoulet	3d0fbf1d74	[CodeGen] Mark setjmp/catchret MBBs address-taken Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113	2015-10-23 15:06:05 +00:00
Davide Italiano	fbb958c24b	[CodeGen] Remove usage of NDEBUG in header. Moreover, this seems unused. llvm-svn: 251081	2015-10-23 00:17:40 +00:00
Matthias Braun	61f4d6439c	MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic llvm-svn: 251037	2015-10-22 18:07:31 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Zia Ansari	8f509a7044	[X86] - Catch extra combine opportunities for redundant imuls. When we fold "mul ((add x, c1), c2)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028	2015-10-22 16:14:45 +00:00
David Majnemer	a8f17871e4	[WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel It's a relic from the earlier implementation, let's remove it. llvm-svn: 250964	2015-10-21 23:20:39 +00:00
Matt Arsenault	29f9663f97	LegalizeDAG: Implement promote for build_vector This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945	2015-10-21 21:10:10 +00:00
Elena Demikhovsky	3ad76a1acd	Masked Load/Store optimization for scalar code When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks. I added optimization for constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893	2015-10-21 11:50:54 +00:00
Jonas Paulsson	17ad04535f	Let MachineVerifier be aware of mem-to-mem instructions. A mem-to-mem instruction (one that both loads and stores) which stores to an FI cannot pass the verifier since it thinks it is loading from the FI. For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand. Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Evan Cheng and Quentin Colombet. llvm-svn: 250885	2015-10-21 07:39:47 +00:00
Krzysztof Parzyszek	fdb7b693a7	Tail duplication can mix incompatible registers in phi nodes Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877	2015-10-21 02:40:06 +00:00
Artyom Skrobov	c736863a85	Two switch blocks in VectorLegalizer::LegalizeOp already have a default: llvm_unreachable("This action is not supported yet!"); -- so I'm adding one to the third switch block, too. This is a follow-up fix for http://reviews.llvm.org/D13862 llvm-svn: 250830	2015-10-20 15:06:37 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Artyom Skrobov	b844fa7fc0	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825	2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith	a25ad0685a	AsmPrinter: Remove implicit ilist iterator conversion, NFC llvm-svn: 250776	2015-10-20 00:36:08 +00:00
Cong Hou	7745dbc5c4	Enhance loop rotation with existence of profile data in MachineBlockPlacement pass. Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header. 2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop. 3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754	2015-10-19 23:16:40 +00:00
Sanjay Patel	69a50a1e17	[CGP] transform select instructions into branches and sink expensive operands This was originally checked in at r250527, but reverted at r250570 because of PR25222. There were at least 2 problems: 1. The cost check was checking for an instruction with an exact cost of TCC_Expensive; that should have been >=. 2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions; we can't sink instructions that may have side effects / are not safe to execute speculatively. Fixed those conditions in sinkSelectOperand() and added test cases. Original commit message: This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250743	2015-10-19 21:59:12 +00:00
Owen Anderson	faf5187ee0	Restore the original behavior of SelectionDAG::getTargetIndex(). It looks like an extra negation snuck in as a part of restoring it. llvm-svn: 250726	2015-10-19 19:27:40 +00:00
Benjamin Kramer	2002aadaad	Put back SelectionDAG::getTargetIndex. While technically this is untested dead code, it has out-of-tree users. This reverts a part of r250434. llvm-svn: 250717	2015-10-19 18:26:16 +00:00
Matthias Braun	e734195ce3	Revert "RegisterPressure: allocatable physreg uses are always kills" This reverts commit r250596. Reverted for now as the commit triggers assert in the AMDGPU target pending investigation. llvm-svn: 250713	2015-10-19 17:44:22 +00:00
Elena Demikhovsky	20662e39f1	Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore(). Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686	2015-10-19 07:43:38 +00:00
Simon Pilgrim	04d52d26f6	Use SDValue bool check. NFCI. llvm-svn: 250653	2015-10-18 12:33:54 +00:00
Simon Pilgrim	c2c154e078	Move one-use variable inside test. NFC. llvm-svn: 250651	2015-10-18 11:47:23 +00:00
Simon Pilgrim	24057b9566	[DAG] Ensure vector constant folding uses correct scalar undef types Minor fix to D13665 found during post-commit review. llvm-svn: 250616	2015-10-17 16:49:43 +00:00
Matthias Braun	65e6d4a3f8	RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC Also do some cleanups and comment improvements. llvm-svn: 250598	2015-10-17 01:03:44 +00:00
Matthias Braun	cdd2792aa6	RegisterPressure: allocatable physreg uses are always kills This property was already used in the code path when no liveness intervals are present. Unfortunately the code path that uses liveness intervals tried to query a cached live interval for an allocatable physreg; those are usually not computed, so a conservative default was used. This doesn't affect any of the lit testcases. This is a precursor to upcoming changes which should be NFC, but without this patch this tidbit wouldn't be NFC. llvm-svn: 250596	2015-10-17 00:46:57 +00:00
Matthias Braun	5105e05e8f	RegisterPressure: Remove 0 entries from PressureChange This should not change behaviour because as far as I can see all code reading the pressure changes has no effect if the PressureInc is 0. Removing these entries however does avoid unnecessary computation, and results in a more stable debug output. I want the stable debug output to check that some upcoming changes are indeed NFC and identical even at the debug output level. llvm-svn: 250595	2015-10-17 00:35:59 +00:00
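
Note on 7745dbc5c4 (llvm-svn: 250754): the three rotation costs listed in that message can be illustrated with a small, self-contained sketch. This is not the MachineBlockPlacement code. The function names, the dictionary-based frequency tables, and the assumption of a single loop header with one combined entry frequency are all hypothetical simplifications chosen for illustration.

```python
from typing import Dict, List, Tuple


def rotation_cost(chain: List[str], start: int, loop_header: str,
                  entry_freq: int, exit_freqs: Dict[str, int],
                  edge_freqs: Dict[Tuple[str, str], int]) -> int:
    """Cost of rotating `chain` so that chain[start] comes first.

    All parameters are hypothetical stand-ins for profile data:
    entry_freq  - frequency of entering the loop header from outside
    exit_freqs  - block name -> frequency of its edges leaving the loop
    edge_freqs  - (src, dst) -> frequency of an edge inside the loop
    """
    head = chain[start]
    tail = chain[start - 1]  # wraps to the last block when start == 0

    cost = 0
    # 1. Entry into the loop cannot fall through unless the header is first.
    if head != loop_header:
        cost += entry_freq
    # 2. Only an exit from the last block can fall through out of the loop.
    cost += sum(freq for block, freq in exit_freqs.items() if block != tail)
    # 3. An edge from the last block back to the first cannot fall through.
    cost += edge_freqs.get((tail, head), 0)
    return cost


def best_rotation(chain: List[str], loop_header: str, entry_freq: int,
                  exit_freqs: Dict[str, int],
                  edge_freqs: Dict[Tuple[str, str], int]
                  ) -> Tuple[List[str], int]:
    """Return the cheapest rotation of `chain` and its cost."""
    costs = [rotation_cost(chain, i, loop_header, entry_freq,
                           exit_freqs, edge_freqs)
             for i in range(len(chain))]
    best = min(range(len(chain)), key=costs.__getitem__)
    return chain[best:] + chain[:best], costs[best]


chain = ["header", "body", "latch"]
edges = {("header", "body"): 100, ("body", "latch"): 100,
         ("latch", "header"): 90}
exits = {"header": 2, "latch": 8}
rotated, cost = best_rotation(chain, "header", 10, exits, edges)
print(rotated, cost)
```

In this made-up example the unrotated chain is already the cheapest choice, because any other rotation breaks a hot fall-through edge inside the loop and also pays the entry cost.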
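
Note on 8f509a7044 (llvm-svn: 251028): the fold named in that message rests on the identity (x + c1) * c2 == x * c2 + c1 * c2, which also holds under fixed-width wrapping arithmetic. The sketch below only checks that identity for 32-bit values. The helper names are hypothetical, and the sketch does not model the multiple-users profitability check the patch adds.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping integer arithmetic


def mul_of_add(x: int, c1: int, c2: int) -> int:
    """Original form: mul (add x, c1), c2."""
    return ((x + c1) * c2) & MASK32


def add_of_mul(x: int, c1: int, c2: int) -> int:
    """Folded form: add (mul x, c2), c1*c2, with c1*c2 folded to a constant."""
    folded_constant = (c1 * c2) & MASK32
    return ((x * c2) + folded_constant) & MASK32


# Spot-check the identity, including values that wrap around 2**32.
for x in (0, 1, 7, 0x7FFFFFFF, 0xFFFFFFFF):
    for c1, c2 in ((3, 5), (0xFFFFFFF0, 17), (1, 0x80000001)):
        assert mul_of_add(x, c1, c2) == add_of_mul(x, c1, c2)
```

The folded form is the one that costs an extra add when (add x, c1) has other users, since the original add must then be kept as well.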
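
Note on 1d1929aace (llvm-svn: 251487): one way range information can yield known bits is that every value in a non-wrapping unsigned range shares the leading bits common to the range's minimum and maximum. The sketch below illustrates that idea only. It is an assumption about the general technique, not a copy of computeKnownBitsFromRangeMetadata, and it handles a single half-open range [lo, hi) with no wrap-around, whereas !range metadata may list several ranges. The function name and signature are hypothetical.

```python
from typing import Tuple


def known_bits_from_range(lo: int, hi: int, width: int) -> Tuple[int, int]:
    """Return (known_zero, known_one) masks for values in [lo, hi).

    Assumes a single unsigned, non-wrapping range that fits in `width` bits.
    """
    assert 0 <= lo < hi <= (1 << width)
    umin, umax = lo, hi - 1

    # Bits above the highest differing bit are identical in umin and umax,
    # and therefore in every value between them.
    differing = umin ^ umax
    prefix_len = width - differing.bit_length()
    prefix_mask = ((1 << prefix_len) - 1) << (width - prefix_len)

    known_one = umin & prefix_mask
    known_zero = ~umin & prefix_mask
    return known_zero, known_one


# Values in [64, 80) all look like 0b0100xxxx in 8 bits.
zero_mask, one_mask = known_bits_from_range(64, 80, 8)
print(bin(zero_mask), bin(one_mask))
```

A range such as [0, 16) gives only known-zero high bits, while a range spanning the full width gives no known bits at all.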


24929 Commits