llvm-project

Author	SHA1	Message	Date
Bob Wilson	00d09428fe	Remove unused conditional negate operations. llvm-svn: 127090	2011-03-05 16:54:31 +00:00
Evan Cheng	6e3d443646	Fix a typo which cause dag combine crash. rdar://9059537. llvm-svn: 126661	2011-02-28 18:45:27 +00:00
Stuart Hastings	67c5c3e939	Support for byval parameters on ARM. Will be enabled by a forthcoming patch to the front-end. Radar 7662569. llvm-svn: 126655	2011-02-28 17:17:53 +00:00
Evan Cheng	d6b641e5bc	More fcopysign correctness and performance fix. The previous codegen for the slow path (when values are in VFP / NEON registers) was incorrect if the source is NaN. The new codegen uses NEON vbsl instruction to copy the sign bit. e.g. vmov.i32 d1, #0x80000000 vbsl d1, d2, d0 If NEON is not available, it uses integer instructions to copy the sign bit. rdar://9034702 llvm-svn: 126295	2011-02-23 02:24:55 +00:00
Devang Patel	f3292b2196	Revert r124611 - "Keep track of incoming argument's location while emitting LiveIns." In other words, do not keep track of argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body. This requires some coordination with debugger to get this working. - The debugger needs to be aware of prolog_end attribute attached with line table entries. - The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+) llvm-svn: 126155	2011-02-21 23:21:26 +00:00
Nate Begeman	fa62d50481	Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types. This avoids moving each element to the integer register file and calling __divsi3 etc. on it. llvm-svn: 125402	2011-02-11 20:53:29 +00:00
Evan Cheng	2da1c95993	Fix buggy fcopysign lowering. This define float @foo(float %x, float %y) nounwind readnone { entry: %0 = tail call float @copysignf(float %x, float %y) nounwind readnone ret float %0 } Was compiled to: vmov s0, r1 bic r0, r0, #-2147483648 vmov s1, r0 vcmpe.f32 s0, #0 vmrs apsr_nzcv, fpscr it lt vneglt.f32 s1, s1 vmov r0, s1 bx lr This fails to copy the sign of -0.0f because it's lost during the float to int conversion. Also, it's sub-optimal when the inputs are in GPR registers. Now it uses integer and + or operations when it's profitable. And it's correct! lsrs r1, r1, #31 bfi r0, r1, #31, #1 bx lr rdar://8984306 llvm-svn: 125357	2011-02-11 02:28:55 +00:00
Evan Cheng	e1a4ac9b5b	Fix an obvious typo which caused an isel assertion. rdar://8964854. llvm-svn: 125023	2011-02-07 18:50:47 +00:00
Bob Wilson	06fce87c4a	Add codegen support for using post-increment NEON load/store instructions. The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. llvm-svn: 125014	2011-02-07 17:43:21 +00:00
Evan Cheng	d42641c6b5	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Devang Patel	56cc5fdf09	Keep track of incoming argument's location while emitting LiveIns. llvm-svn: 124611	2011-01-31 21:38:14 +00:00
Anton Korobeynikov	f3a62314f3	Provide correct registers for EH stuff on ARM llvm-svn: 124151	2011-01-24 22:38:45 +00:00
Evan Cheng	2f2435d026	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Evan Cheng	b8b0ad80a8	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevent CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look pass some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improve code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperform the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Andrew Trick	43f2563114	For ARM subtargets with useNEONForSinglePrecisionFP, double count uses of the floating point types less than 64-bits. It's somewhat of a temporary hack but forces more accurate modeling of register pressure and results in fewer spills. llvm-svn: 123811	2011-01-19 02:35:27 +00:00
Andrew Trick	5eb0a30216	whitespace llvm-svn: 123810	2011-01-19 02:26:13 +00:00
Evan Cheng	68aec147b7	Don't forget to emit the load from indirect symbol when using movw + movt to materialize GA indirect symbols. llvm-svn: 123809	2011-01-19 02:16:49 +00:00
Evan Cheng	dfce83c8f5	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Eric Christopher	2af9551ebf	Fix 80-cols. llvm-svn: 123494	2011-01-14 23:50:53 +00:00
Anton Korobeynikov	2f93128109	Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Jakob Stoklund Olesen	2fb5b31578	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions not longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Bob Wilson	3fa9c064c2	Add an explanatory message for an assertion. llvm-svn: 123042	2011-01-07 23:40:46 +00:00
Matt Beaumont-Gay	5cc7a1fcad	Eliminate variable only used in debug builds. llvm-svn: 123040	2011-01-07 22:34:58 +00:00
Bob Wilson	6f2b8966ca	Lower some BUILD_VECTORS using VEXT+shuffle. Patch by Tim Northover. llvm-svn: 123035	2011-01-07 21:37:30 +00:00
Bob Wilson	8265d56638	Add ARM patterns to match EXTRACT_SUBVECTOR nodes. Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. llvm-svn: 122995	2011-01-07 04:59:04 +00:00
Evan Cheng	3ae2b79aa3	Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952	2011-01-06 06:52:41 +00:00
Bob Wilson	36be00ceb3	Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions. If the basic block containing the BCCi64 (or BCCZi64) instruction ends with an unconditional branch, that branch needs to be deleted before appending the expansion of the BCCi64 to the end of the block. llvm-svn: 122521	2010-12-23 22:45:49 +00:00
Bob Wilson	1a20c2aedd	Add ARM-specific DAG combining to cast i64 vector element load/stores to f64. Type legalization splits up i64 values into pairs of i32 values, which leads to poor quality code when inserting or extracting i64 vector elements. If the vector element is loaded or stored, it can be treated as an f64 value and loaded or stored directly from a VPR register. Use the pre-legalization DAG combiner to cast those vector elements to f64 types so that the type legalizer won't mess them up. Radar 8755338. llvm-svn: 122319	2010-12-21 06:43:19 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Bob Wilson	f268d0303b	Add some missing entries in ARMTargetLowering::getTargetNodeName. llvm-svn: 122111	2010-12-18 00:04:26 +00:00
Eric Christopher	347f4c32e8	Don't handle -arm-long-calls in fast isel for now. llvm-svn: 121919	2010-12-15 23:47:29 +00:00
Evan Cheng	c177813755	bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663 llvm-svn: 121746	2010-12-14 03:22:07 +00:00
Evan Cheng	2e51bb4ff0	Generalize BFI isel lowering a bit. llvm-svn: 121714	2010-12-13 20:32:54 +00:00
Evan Cheng	3434575704	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 llvm-svn: 121606	2010-12-11 04:11:38 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Evan Cheng	419ea286ee	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Bob Wilson	2d790df105	Add support for NEON VLD2-dup instructions. llvm-svn: 120236	2010-11-28 06:51:26 +00:00
Bob Wilson	62a6f7eda6	Add entry in getTargetNodeName() for ARMISD::VBICIMM. llvm-svn: 120233	2010-11-28 06:51:11 +00:00
Bob Wilson	d7d2cf7842	Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations. We need to check if the individual vector elements are sign/zero-extended values. For now this only handles constants values. Radar 8687140. llvm-svn: 120034	2010-11-23 19:38:38 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Evan Cheng	2debc86138	These instructions are thumb2 only. llvm-svn: 119793	2010-11-19 06:28:11 +00:00
Tanya Lattner	cd68095650	Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first). Added test case. llvm-svn: 119749	2010-11-18 22:06:46 +00:00
Anton Korobeynikov	0eecf5d201	Move hasFP() and few related hooks to TargetFrameInfo. llvm-svn: 119740	2010-11-18 21:19:35 +00:00
Bob Wilson	7d47133ff7	Split up ARM LowerShift function. This function was being called from two different places for completely unrelated reasons. During type legalization, it was called to expand 64-bit shift operations. During operation legalization, it was called to handle Neon vector shifts. The vector shift code was not written to check for illegal types, since it was assumed to be only called after type legalization. Fixed this by splitting off the 64-bit shift expansion into a separate function. I don't have a particular testcase for this; I just noticed it by inspection. llvm-svn: 119738	2010-11-18 21:16:28 +00:00
Nate Begeman	ca52411955	Fix an issue where we tried to turn a v2f32 build_vector into a v4i32 build vector with 2 elts llvm-svn: 118720	2010-11-10 21:35:41 +00:00
Bob Wilson	193722ebc8	Do not use MEMBARRIER_MCR for any Thumb code. It is only supported for ARM code. Normally Thumb2 code would use DMB instead, but depending on how the compiler is invoked (e.g., -mattr=-db) that might be disabled. This prevents a "cannot select MEMBARRIER_MCR" error in that situation. Radar 8644195 llvm-svn: 118642	2010-11-09 22:50:44 +00:00
Jim Grosbach	a942ad4222	Change the ARMConstantPoolValue modifier string to an enumeration. This will help in MC'izing the references that use them. llvm-svn: 118633	2010-11-09 21:36:17 +00:00
Owen Anderson	c7baee31ad	Add support for ARM's specialized vector-compare-against-zero instructions. llvm-svn: 118453	2010-11-08 23:21:22 +00:00

1 2 3 4 5 ...

557 Commits