llvm-project

Author	SHA1	Message	Date
Nick Lewycky	77e030bca9	Replace custom dispatch code with two uses of InstVisitor. Improves compile-time performance. llvm-svn: 30896	2006-10-12 02:02:44 +00:00
Chris Lattner	41b442242d	Implement SROA of unions with mixed pointers/integers in them. This implements PR892 and Transforms/ScalarRepl/union-pointer.ll:test2 llvm-svn: 30825	2006-10-08 23:53:04 +00:00
Chris Lattner	05f8272afa	Implement Transforms/ScalarRepl/union-pointer.ll:test llvm-svn: 30823	2006-10-08 23:28:04 +00:00
Chris Lattner	2deeaeaca7	add a new SimplifyDemandedVectorElts method, which works similarly to SimplifyDemandedBits. The idea is that some operations can be simplified if not all of the computed elements are needed. Some targets (like x86) have a large number of intrinsics that operate on a single element, but pass other elts through unmodified. If those other elements are not needed, the intrinsics can be simplified to scalar operations, and insertelement ops can be removed. This turns (f.e.): ushort %Convert_sse(float %f) { %tmp = insertelement <4 x float> undef, float %f, uint 0 ; <<4 x float>> [#uses=1] %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1 ; <<4 x float>> [#uses=1] %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2 ; <<4 x float>> [#uses=1] %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3 ; <<4 x float>> [#uses=1] %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } into: ushort %Convert_sse(float %f) { entry: %tmp28 = sub float %f, 1.000000e+00 ; <float> [#uses=1] %tmp37 = mul float %tmp28, 5.000000e-01 ; <float> [#uses=1] %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0 ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } which improves codegen from: _Convert_sse: movss LCPI1_0, %xmm0 movss 4(%esp), %xmm1 subss %xmm0, %xmm1 movss LCPI1_1, %xmm0 mulss %xmm0, %xmm1 movss LCPI1_2, %xmm0 minss %xmm0, %xmm1 xorps %xmm0, %xmm0 maxss %xmm0, %xmm1 cvttss2si %xmm1, %eax andl $65535, %eax ret to: _Convert_sse: movss 4(%esp), %xmm0 subss LCPI1_0, %xmm0 mulss LCPI1_1, %xmm0 movss LCPI1_2, %xmm1 minss %xmm1, %xmm0 xorps %xmm1, %xmm1 maxss %xmm1, %xmm0 cvttss2si %xmm0, %eax andl $65535, %eax ret This is just a first step, it can be extended in many ways. Testcase here: Transforms/InstCombine/vec_demanded_elts.ll llvm-svn: 30752	2006-10-05 06:55:50 +00:00
Chris Lattner	52886e72d7	This case isn't implemented yet. It seems unlikely to be needed, but if it ever is, we want to get an assert instead of silent bad codegen. llvm-svn: 30716	2006-10-04 04:58:58 +00:00
Nick Lewycky	58a910dff5	Simplify logic further. Ensure that we copy KnownProperties before calling visitBasicBlock, else we may leak properties into blocks where they don't belong. llvm-svn: 30705	2006-10-03 17:36:01 +00:00
Nick Lewycky	1d00f3e144	Simplify, now that predsimplify depends on break-crit-edges. Fix SwitchInst where dest-block is the same as one of the cases. llvm-svn: 30700	2006-10-03 15:19:11 +00:00
Nick Lewycky	755f801adc	Move break-crit-edges before the predicate simplifier. Allows us to optimize in more cases. llvm-svn: 30699	2006-10-03 14:52:23 +00:00
Evan Cheng	ff510a58c2	Revert previous patch. Still breaking things. llvm-svn: 30698	2006-10-03 07:26:07 +00:00
Chris Lattner	8aca0ee8c3	Fix PR932 and Analysis/Dominators/2006-10-02-BreakCritEdges.ll: The critical edge block dominates the dest block if the destblock dominates all edges other than the one incoming from the critical edge. llvm-svn: 30696	2006-10-03 07:02:02 +00:00
Chris Lattner	7d19067c42	Fix a bug from r1.391 of this file, where we checked the size instead of the alignment when promoting allocations. This implements InstCombine/cast.ll:test32 llvm-svn: 30682	2006-10-01 19:40:58 +00:00
Chris Lattner	4797c891c0	Fix debug output llvm-svn: 30680	2006-09-30 23:32:50 +00:00
Chris Lattner	24d3d4280a	Implement SRA of heap allocations. llvm-svn: 30679	2006-09-30 23:32:09 +00:00
Chris Lattner	80a01ef6f0	Add some ifdef'd out debug info llvm-svn: 30676	2006-09-30 19:40:30 +00:00
Chris Lattner	6ab03f6a08	Eliminate ConstantBool::True and ConstantBool::False. Instead, provide ConstantBool::getTrue() and ConstantBool::getFalse(). llvm-svn: 30665	2006-09-28 23:35:22 +00:00
Owen Anderson	7cb6809c25	Another attempt at making ArgPromotion smarter. This patch no longer breaks Burg. llvm-svn: 30657	2006-09-28 23:02:22 +00:00
Chris Lattner	525804f31e	simplify code llvm-svn: 30656	2006-09-28 22:58:25 +00:00
Chris Lattner	e03ca2ca4a	set DEBUG_TYPE right llvm-svn: 30623	2006-09-27 04:58:23 +00:00
Nick Lewycky	059c79264f	Style changes only. Remove dead code, fix a comment. llvm-svn: 30588	2006-09-23 15:13:08 +00:00
Chris Lattner	6bd6da4097	Be far more careful when splitting a loop header, either to form a preheader or when splitting loops with a common header into multiple loops. In particular the old code would always insert the preheader before the old loop header. This is disasterous in cases where the loop hasn't been rotated. For example, it can produce code like: .. outside the loop... jmp LBB1_2 #bb13.outer LBB1_1: #bb1 movsd 8(%esp,%esi,8), %xmm1 mulsd (%edi), %xmm1 addsd %xmm0, %xmm1 addl $24, %edi incl %esi jmp LBB1_3 #bb13 LBB1_2: #bb13.outer leal (%edx,%eax,8), %edi pxor %xmm1, %xmm1 xorl %esi, %esi LBB1_3: #bb13 movapd %xmm1, %xmm0 cmpl $4, %esi jl LBB1_1 #bb1 Note that the loop body is actually LBB1_1 + LBB1_3, which means that the loop now contains an uncond branch WITHIN it to jump around the inserted loop header (LBB1_2). Doh. This patch changes the preheader insertion code to insert it in the right spot, producing this code: ... outside the loop, fall into the header ... LBB1_1: #bb13.outer leal (%edx,%eax,8), %esi pxor %xmm0, %xmm0 xorl %edi, %edi jmp LBB1_3 #bb13 LBB1_2: #bb1 movsd 8(%esp,%edi,8), %xmm0 mulsd (%esi), %xmm0 addsd %xmm1, %xmm0 addl $24, %esi incl %edi LBB1_3: #bb13 movapd %xmm0, %xmm1 cmpl $4, %edi jl LBB1_2 #bb1 Totally crazy, no branch in the loop! :) llvm-svn: 30587	2006-09-23 08:19:21 +00:00
Chris Lattner	608cd05e3f	Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not reachable, making it general purpose enough for use by InsertPreheaderForLoop. Eliminate custom dominfo updating code in InsertPreheaderForLoop, using UpdateDomInfoForRevectoredPreds instead. llvm-svn: 30586	2006-09-23 07:40:52 +00:00
Chris Lattner	51c95cdd82	Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll llvm-svn: 30555	2006-09-21 05:12:20 +00:00
Nick Lewycky	fde9c308b2	Don't rewrite ConstantExpr::get. llvm-svn: 30552	2006-09-21 01:05:35 +00:00
Nick Lewycky	d74c55f483	Once we're down to "setcc type constant1, constant2", at least come up with the right answer. llvm-svn: 30550	2006-09-20 23:02:24 +00:00
Nick Lewycky	cfff1c3f86	Use a total ordering to compare instructions. Fixes infinite loop in resolve(). llvm-svn: 30540	2006-09-20 17:04:01 +00:00
Andrew Lenharth	44cb67af5c	simplify llvm-svn: 30535	2006-09-20 15:37:57 +00:00
Chris Lattner	380c7e9a59	We went through all that trouble to compute whether it was safe to transform this comparison, but never checked it. Whoops, no wonder we miscompiled 177.mesa! llvm-svn: 30511	2006-09-20 04:44:59 +00:00
Evan Cheng	cd3f6ff0e5	Back out Chris' last set of changes. This breaks 177.mesa and povray somehow. llvm-svn: 30505	2006-09-20 01:39:40 +00:00
Evan Cheng	453280b94d	80 col. llvm-svn: 30504	2006-09-20 01:10:02 +00:00
Andrew Lenharth	4f339bebb0	If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness llvm-svn: 30496	2006-09-19 18:24:51 +00:00
Chris Lattner	12f52faf93	implement select.ll:test19-22 llvm-svn: 30482	2006-09-19 06:18:21 +00:00
Nick Lewycky	b9c5483a93	Walk down the dominator tree instead of the control flow graph. That means that we can't modify the CFG any more, at least not until it's possible to update the dominator tree (PR217). llvm-svn: 30469	2006-09-18 21:09:35 +00:00
Chris Lattner	de07792595	Fix an infinite loop building the CFE llvm-svn: 30465	2006-09-18 18:27:05 +00:00
Chris Lattner	67a35bbce7	Implement a trivial optzn: of vastart is never called in a function that takes ... args, remove the '...'. This is Transforms/DeadArgElim/dead_vaargs.ll llvm-svn: 30459	2006-09-18 07:02:31 +00:00
Chris Lattner	4922a0e53f	Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%. llvm-svn: 30456	2006-09-18 05:27:43 +00:00
Chris Lattner	420c4bcc8d	Implement Transforms/InstCombine/shift-sra.ll:test0 llvm-svn: 30450	2006-09-18 04:31:40 +00:00
Chris Lattner	b3f24c91b0	Rewrite shift/and/compare sequences to promote better licm of the RHS. Use isLogicalShift/isArithmeticShift to simplify code. llvm-svn: 30448	2006-09-18 04:22:48 +00:00
Chris Lattner	850465d53f	Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913 llvm-svn: 30405	2006-09-16 03:14:10 +00:00
Chris Lattner	9482cc5b16	revert previous two patches. They cause miscompilation of MultiSource/Applications/Burg llvm-svn: 30397	2006-09-15 17:24:45 +00:00
Owen Anderson	edadd3faee	Revert my previous work on ArgumentPromotion. Further investigation has revealed these changes to be incorrect. They just weren't showing up in any of our current testcases. llvm-svn: 30385	2006-09-15 05:22:51 +00:00
Anton Korobeynikov	d61d39ec53	Adding dllimport, dllexport and external weak linkage types. DLL* linkages got full (I hope) codegeneration support in C & both x86 assembler backends. External weak linkage added for future use, we don't provide any codegeneration, etc. support for it. llvm-svn: 30374	2006-09-14 18:23:27 +00:00
Chris Lattner	237ccf2a51	Second half of the fix for Transforms/Inline/inline_cleanup.ll This folds unconditional branches that are often produced by code specialization. llvm-svn: 30307	2006-09-13 21:27:00 +00:00
Nick Lewycky	12efffc96b	Add some more consistency checks. llvm-svn: 30305	2006-09-13 19:32:53 +00:00
Nick Lewycky	51ce8d6b46	Fix unionSets so that it can merge correctly. llvm-svn: 30304	2006-09-13 19:24:01 +00:00
Chris Lattner	6ef6d06d21	Implement the first half of Transforms/Inline/inline_cleanup.ll llvm-svn: 30303	2006-09-13 19:23:57 +00:00
Nick Lewycky	3a4dc7b489	Erase dead instructions. llvm-svn: 30298	2006-09-13 18:55:37 +00:00
Devang Patel	fab4972a6e	Initialize DontInternalize. llvm-svn: 30281	2006-09-13 01:02:26 +00:00
Chris Lattner	1d7ec20a4d	An sinkable instruction may exist with uses, if those uses are in dead blocks. Handle this. This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll llvm-svn: 30275	2006-09-12 19:17:09 +00:00
Chris Lattner	d28627009a	Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll llvm-svn: 30266	2006-09-11 21:43:16 +00:00
Nick Lewycky	e94f42a740	Skip the linear search if the answer is already known. llvm-svn: 30251	2006-09-11 17:23:34 +00:00

1 2 3 4 5 ...

2568 Commits