llvm-project

Author	SHA1	Message	Date
Chris Lattner	2deeaeaca7	add a new SimplifyDemandedVectorElts method, which works similarly to SimplifyDemandedBits. The idea is that some operations can be simplified if not all of the computed elements are needed. Some targets (like x86) have a large number of intrinsics that operate on a single element, but pass other elts through unmodified. If those other elements are not needed, the intrinsics can be simplified to scalar operations, and insertelement ops can be removed. This turns (f.e.): ushort %Convert_sse(float %f) { %tmp = insertelement <4 x float> undef, float %f, uint 0 ; <<4 x float>> [#uses=1] %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1 ; <<4 x float>> [#uses=1] %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2 ; <<4 x float>> [#uses=1] %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3 ; <<4 x float>> [#uses=1] %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } into: ushort %Convert_sse(float %f) { entry: %tmp28 = sub float %f, 1.000000e+00 ; <float> [#uses=1] %tmp37 = mul float %tmp28, 5.000000e-01 ; <float> [#uses=1] %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0 ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } which improves codegen from: _Convert_sse: movss LCPI1_0, %xmm0 movss 4(%esp), %xmm1 subss %xmm0, %xmm1 movss LCPI1_1, %xmm0 mulss %xmm0, %xmm1 movss LCPI1_2, %xmm0 minss %xmm0, %xmm1 xorps %xmm0, %xmm0 maxss %xmm0, %xmm1 cvttss2si %xmm1, %eax andl $65535, %eax ret to: _Convert_sse: movss 4(%esp), %xmm0 subss LCPI1_0, %xmm0 mulss LCPI1_1, %xmm0 movss LCPI1_2, %xmm1 minss %xmm1, %xmm0 xorps %xmm1, %xmm1 maxss %xmm1, %xmm0 cvttss2si %xmm0, %eax andl $65535, %eax ret This is just a first step, it can be extended in many ways. Testcase here: Transforms/InstCombine/vec_demanded_elts.ll llvm-svn: 30752	2006-10-05 06:55:50 +00:00
Chris Lattner	7d19067c42	Fix a bug from r1.391 of this file, where we checked the size instead of the alignment when promoting allocations. This implements InstCombine/cast.ll:test32 llvm-svn: 30682	2006-10-01 19:40:58 +00:00
Chris Lattner	6ab03f6a08	Eliminate ConstantBool::True and ConstantBool::False. Instead, provide ConstantBool::getTrue() and ConstantBool::getFalse(). llvm-svn: 30665	2006-09-28 23:35:22 +00:00
Andrew Lenharth	44cb67af5c	simplify llvm-svn: 30535	2006-09-20 15:37:57 +00:00
Chris Lattner	380c7e9a59	We went through all that trouble to compute whether it was safe to transform this comparison, but never checked it. Whoops, no wonder we miscompiled 177.mesa! llvm-svn: 30511	2006-09-20 04:44:59 +00:00
Evan Cheng	cd3f6ff0e5	Back out Chris' last set of changes. This breaks 177.mesa and povray somehow. llvm-svn: 30505	2006-09-20 01:39:40 +00:00
Evan Cheng	453280b94d	80 col. llvm-svn: 30504	2006-09-20 01:10:02 +00:00
Andrew Lenharth	4f339bebb0	If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness llvm-svn: 30496	2006-09-19 18:24:51 +00:00
Chris Lattner	12f52faf93	implement select.ll:test19-22 llvm-svn: 30482	2006-09-19 06:18:21 +00:00
Chris Lattner	de07792595	Fix an infinite loop building the CFE llvm-svn: 30465	2006-09-18 18:27:05 +00:00
Chris Lattner	4922a0e53f	Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%. llvm-svn: 30456	2006-09-18 05:27:43 +00:00
Chris Lattner	420c4bcc8d	Implement Transforms/InstCombine/shift-sra.ll:test0 llvm-svn: 30450	2006-09-18 04:31:40 +00:00
Chris Lattner	b3f24c91b0	Rewrite shift/and/compare sequences to promote better licm of the RHS. Use isLogicalShift/isArithmeticShift to simplify code. llvm-svn: 30448	2006-09-18 04:22:48 +00:00
Chris Lattner	850465d53f	Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913 llvm-svn: 30405	2006-09-16 03:14:10 +00:00
Chris Lattner	d28627009a	Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll llvm-svn: 30266	2006-09-11 21:43:16 +00:00
Chris Lattner	0468987592	Implement Transforms/InstCombine/hoist_instr.ll llvm-svn: 30234	2006-09-09 22:02:56 +00:00
Chris Lattner	d79dc79831	Turn div X, (Cond ? Y : 0) -> div X, Y This implements select.ll::test18. llvm-svn: 30230	2006-09-09 20:26:32 +00:00
Chris Lattner	c2d3d3112e	eliminate RegisterOpt. It does the same thing as RegisterPass. llvm-svn: 29925	2006-08-27 22:42:52 +00:00
Chris Lattner	3d27be1333	s\|llvm/Support/Visibility.h\|llvm/Support/Compiler.h\| llvm-svn: 29911	2006-08-27 12:54:02 +00:00
Chris Lattner	091b6ea847	Silence a warning produced in assertions-disabled mode llvm-svn: 29108	2006-07-11 18:31:26 +00:00
Owen Anderson	bbf8990ef7	Add a comment, and fix a typo that broke the build. llvm-svn: 29094	2006-07-10 22:15:25 +00:00
Owen Anderson	ae8aa646f1	Don't indent the entire function. llvm-svn: 29093	2006-07-10 22:03:18 +00:00
Chris Lattner	b7845d69db	Recognize 16-bit bswaps by relaxing overconstrained pattern. This implements Transforms/InstCombine/bswap.ll:test[34]. llvm-svn: 29087	2006-07-10 20:25:24 +00:00
Owen Anderson	a6968f83b2	Make instcombine not remove Phi nodes when LCSSA is live. llvm-svn: 29083	2006-07-10 19:03:49 +00:00
Chris Lattner	4a4c7fe7fa	Shrink libllvmgcc.dylib by another 23K llvm-svn: 28972	2006-06-28 22:08:15 +00:00
Chris Lattner	3fda386965	Fix Transforms/InstCombine/2006-06-28-infloop.ll llvm-svn: 28961	2006-06-28 17:34:50 +00:00
Andrew Lenharth	ebfa24ee9a	Catch more function pointer casting problems Remove the Function pointer cast in these calls, converting it to a cast of argument. %tmp60 = tail call int cast (int (ulong)* %str to int (int))( int 10 ) %tmp60 = tail call int cast (int (ulong) %str to int (int)*)( uint %tmp51 ) llvm-svn: 28953	2006-06-28 01:01:52 +00:00
Chris Lattner	c482a9e31a	Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap idioms into bswap intrinsics. llvm-svn: 28803	2006-06-15 19:07:26 +00:00
Chris Lattner	95cebb082f	Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on PPC/altivec llvm-svn: 28698	2006-06-06 22:26:02 +00:00
Chris Lattner	1df0e98ac2	Swap the order of operands created here. For +&\|^, the order doesn't matter, but for sub, it really does! Fix fixes a miscompilation of fibheap_cut in llvmgcc4. llvm-svn: 28600	2006-05-31 21:14:00 +00:00
Chris Lattner	dab43b2b0e	Implement Transforms/InstCombine/store.ll:test2. llvm-svn: 28503	2006-05-26 19:19:20 +00:00
Chris Lattner	0e47716e69	Transform things like (splat(splat)) -> splat llvm-svn: 28490	2006-05-26 00:29:06 +00:00
Chris Lattner	12249be286	Introduce a helper function that simplifies interpretation of shuffle masks. No functionality change. llvm-svn: 28489	2006-05-25 23:48:38 +00:00
Chris Lattner	99155be33f	Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in the program. This exposes more opportunities for the instcombiner, and implements vec_shuffle.ll:test6 llvm-svn: 28487	2006-05-25 23:24:33 +00:00
Chris Lattner	83f6578b0c	extract element from a shuffle vector can be trivially turned into an extractelement from the SV's source. This implement vec_shuffle.ll:test[45] llvm-svn: 28485	2006-05-25 22:53:38 +00:00
Chris Lattner	d0622b6894	Silence a bogus gcc warning llvm-svn: 28422	2006-05-20 23:14:03 +00:00
Evan Cheng	18d0438148	Backing out last check-in for now. It's causing an infinite loop gccas lencode. llvm-svn: 28284	2006-05-14 06:46:03 +00:00
Chris Lattner	3987a8532d	Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit bitfield now gives this code: _plus: lwz r2, 0(r3) rlwimi r2, r2, 0, 1, 31 xoris r2, r2, 32768 stw r2, 0(r3) blr instead of this: _plus: lwz r2, 0(r3) srwi r4, r2, 31 slwi r4, r4, 31 addis r4, r4, -32768 rlwimi r2, r4, 0, 0, 0 stw r2, 0(r3) blr this can obviously still be improved. llvm-svn: 28275	2006-05-13 02:16:08 +00:00
Chris Lattner	1ebbe6a22e	Implement simple promotion for cast elimination in instcombine. This is currently very limited, but can be extended in the future. For example, we now compile: uint %test30(uint %c1) { %c2 = cast uint %c1 to ubyte %c3 = xor ubyte %c2, 1 %c4 = cast ubyte %c3 to uint ret uint %c4 } to: _xor: movzbl 4(%esp), %eax xorl $1, %eax ret instead of: _xor: movb $1, %al xorb 4(%esp), %al movzbl %al, %eax ret More impressively, we now compile: struct B { unsigned bit : 1; }; void xor(struct B *b) { b->bit = b->bit ^ 1; } To (X86/PPC): _xor: movl 4(%esp), %eax xorl $-2147483648, (%eax) ret _xor: lwz r2, 0(r3) xoris r2, r2, 32768 stw r2, 0(r3) blr instead of (X86/PPC): _xor: movl 4(%esp), %eax movl (%eax), %ecx movl %ecx, %edx shrl $31, %edx # TRUNCATE movb %dl, %dl xorb $1, %dl movzbl %dl, %edx andl $2147483647, %ecx shll $31, %edx orl %ecx, %edx movl %edx, (%eax) ret _xor: lwz r2, 0(r3) srwi r4, r2, 31 xori r4, r4, 1 rlwimi r2, r4, 31, 0, 0 stw r2, 0(r3) blr This implements InstCombine/cast.ll:test30. llvm-svn: 28273	2006-05-13 02:06:03 +00:00
Chris Lattner	1443bc52be	Refactor some code, making it simpler. When doing the initial pass of constant folding, if we get a constantexpr, simplify the constant expr like we would do if the constant is folded in the normal loop. This fixes the missed-optimization regression in Transforms/InstCombine/getelementptr.ll last night. llvm-svn: 28224	2006-05-11 17:11:52 +00:00
Chris Lattner	a36ee4ea34	Two changes: 1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable blocks (due to constants in conditional branches/switches) to the worklist. This causes them to be deleted before instcombine starts up, leading to better optimization. 2. In the prepass over instructions, do trivial constprop/dce as we go. This has the effect of improving the effectiveness of #1. In addition, it significantly speeds up instcombine on test cases with large amounts of constant folding code (for example, that produced by code specialization or partial evaluation). In one example, it speeds up instcombine from 0.0589s to 0.0224s with a release build (a 2.6x speedup). llvm-svn: 28215	2006-05-10 19:00:36 +00:00
Chris Lattner	1d441adfbf	Move some code around. Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation only apply when both casts really will cause code to be generated. If one or both doesn't, then this xform doesn't remove a cast. This fixes Transforms/InstCombine/2006-05-06-Infloop.ll llvm-svn: 28141	2006-05-06 09:00:16 +00:00
Chris Lattner	e745c7de0e	Fix an infinite loop compiling oggenc last night. llvm-svn: 28128	2006-05-05 20:51:30 +00:00
Chris Lattner	3af1053488	Implement InstCombine/cast.ll:test29 llvm-svn: 28126	2006-05-05 06:39:07 +00:00
Chris Lattner	fb29692055	Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll llvm-svn: 28101	2006-05-04 17:33:35 +00:00
Chris Lattner	655d08fda8	Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll llvm-svn: 28019	2006-04-28 22:21:41 +00:00
Chris Lattner	b6cb64b7e6	Add support for inserting undef into a vector. This implements Transforms/InstCombine/vec_insert_to_shuffle.ll llvm-svn: 27997	2006-04-27 21:14:21 +00:00
Andrew Lenharth	f89e630b2f	Make code match cvs commit message :) llvm-svn: 27881	2006-04-20 15:41:37 +00:00
Andrew Lenharth	61eae29ad6	If we can convert the return pointer type into an integer that IntPtrType can be converted to losslessly, we can continue the conversion to a direct call. llvm-svn: 27880	2006-04-20 14:56:47 +00:00
Chris Lattner	36dd7c98d1	Turn x86 unaligned load/store intrinsics into aligned load/store instructions if the pointer is known aligned. llvm-svn: 27781	2006-04-17 22:26:56 +00:00

1 2 3 4 5 ...

518 Commits