302 Commits

Author SHA1 Message Date
Chris Lattner
bfca1ab79d Rename PowerPC*.h to PPC*.h
llvm-svn: 23743
2005-10-14 23:51:18 +00:00
Chris Lattner
0921e3bfc1 Eliminate PowerPC.td and PPC32.td, consolidating them into PPC.td
llvm-svn: 23738
2005-10-14 23:37:35 +00:00
Chris Lattner
7d9f719d42 These are now autogenerated
llvm-svn: 23731
2005-10-14 06:26:29 +00:00
Chris Lattner
89c7fa22b1 Disable formation of rlwinm instructions from SRA bases. This fixes
the 177.mesa failure from last night, and fixes the
CodeGen/PowerPC/2005-10-08-ArithmeticRotate.ll regression test I added.
If this code cannot be fixed, it should be removed for good, but I'll leave
it to Nate to decide its fate.

llvm-svn: 23670
2005-10-09 05:36:17 +00:00
Chris Lattner
dae96f8881 When preselecting, favor things that have low depth to select first. This
is faster and uses less stack space.  This reduces our stack requirement
enough to compile sixtrack, and though it's a hack, should be enough until
we switch to iterative isel

llvm-svn: 23664
2005-10-07 22:10:27 +00:00
Chris Lattner
318622fb9f Pull out Call, reducing stack frame size from 6032 bytes to 5184 bytes.
llvm-svn: 23650
2005-10-06 19:07:45 +00:00
Chris Lattner
491b8294f4 Pull out setcc, this reduces stack frame size from 7520 to 6032 bytes
llvm-svn: 23649
2005-10-06 19:03:35 +00:00
Chris Lattner
502a36935e Pull two more methods out, reducing stack frame size from 8224 -> 7520 bytes
llvm-svn: 23648
2005-10-06 18:56:10 +00:00
Chris Lattner
259e6c76f2 Add a recursive-iterative hybrid stage to attempt to reduce stack space, this
helps but not enough.

Start pulling cases out of PPC32DAGToDAGISel::Select.  With GCC 4, this function
required 8512 bytes of stack space for each invocation (GCC 3 required less
than 700 bytes).  Pulling this first function out gets us down to 8224.  More
to come :(

llvm-svn: 23647
2005-10-06 18:45:51 +00:00
Chris Lattner
3734d204b8 another solution to the fsel issue. Instead of having 4 variants, just force
the comparison to be 64-bits.  This is fine because extensions from float
to double are free.

llvm-svn: 23589
2005-10-02 07:07:49 +00:00
Chris Lattner
9e98672962 fsel can take a different FP type for the comparison and for the result. As such
split the FSEL family into 4 things instead of just two.

llvm-svn: 23588
2005-10-02 06:58:23 +00:00
Chris Lattner
5ab9d42bb4 Minor tweak to the branch selector. When emitting a two-way branch, and if
we're in a single-mbb loop, make sure to emit the backwards branch as the
conditional branch instead of the uncond branch.  For example, emit this:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        ble cr0, LBBl29_z__44
        b LBBl29_z__48                   *** NOT PART OF LOOP

Instead of:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        bgt cr0, LBBl29_z__48            *** PART OF LOOP!
        b LBBl29_z__44

The former sequence has one fewer dispatch group for the loop body.

llvm-svn: 23582
2005-10-01 23:06:26 +00:00
Chris Lattner
8713ebf37c fix typo
llvm-svn: 23578
2005-10-01 02:51:36 +00:00
Chris Lattner
d3eee1a09b Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
These are used to represent float and double values, and the two regclasses
contain the same physical registers.

llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Jim Laskey
f61232354f Should be using flag and not chain.
llvm-svn: 23572
2005-09-30 23:43:37 +00:00
Chris Lattner
1de5706e68 Remove code for patterns that are autogenerated
llvm-svn: 23532
2005-09-29 23:33:31 +00:00
Chris Lattner
08c319fbdd Never rely on ReplaceAllUsesWith when selecting, use CodeGenMap instead.
ReplaceAllUsesWith does not replace scalars SDOperand floating around on
the stack, permitting things to be selected multiple times.

llvm-svn: 23515
2005-09-29 00:59:32 +00:00
Chris Lattner
b9b2e77295 Autogen MUL, move FP cases together
llvm-svn: 23512
2005-09-28 22:53:16 +00:00
Chris Lattner
5769311c92 disentangle FP from INT versions of div/mul
llvm-svn: 23511
2005-09-28 22:50:24 +00:00
Chris Lattner
585131baaf Use the autogenerated matcher for ADD/SUB
llvm-svn: 23510
2005-09-28 22:47:28 +00:00
Chris Lattner
d3ea19b51a Add FP versions of the binary operators, keeping the int and fp worlds seperate.
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner
fab48b3285 All (xor *) cases are autogenerated now
llvm-svn: 23497
2005-09-28 18:12:37 +00:00
Chris Lattner
33f8e08c8f Implement PowerPC/eqv-andc-orc-nor.ll:EQV3
llvm-svn: 23494
2005-09-28 18:04:52 +00:00
Chris Lattner
bb5939a436 These nodes are all autogenerated
llvm-svn: 23489
2005-09-28 17:07:09 +00:00
Chris Lattner
c628f00845 Make sure to clear the CodeGenMap after each basic block is selected to avoid
cross MBB pollution.

llvm-svn: 23470
2005-09-27 17:45:33 +00:00
Chris Lattner
b011cb2746 we don't need this proto any longer
llvm-svn: 23342
2005-09-13 22:05:21 +00:00
Chris Lattner
03e08eefc7 move the #include for the generated code into the isel class body so we
can use/define class methods

llvm-svn: 23339
2005-09-13 22:03:06 +00:00
Chris Lattner
4309c3a785 PowerPC cannot truncstore i1 natively
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Chris Lattner
498915dafa Remove some cases handled by the generated portion of the isel
llvm-svn: 23262
2005-09-07 23:45:15 +00:00
Nate Begeman
6095214bf0 Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.

This speeds up 189.lucas from 81.99 to 32.64 seconds.

llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner
8ae9525bd0 include the dag isel fragment
llvm-svn: 23239
2005-09-03 01:17:22 +00:00
Chris Lattner
5f12cf14be Change the isel to not break out of the big giant switch. Instead, the
switch should never be exited, so its bottom is now unreachable.

llvm-svn: 23234
2005-09-03 00:53:47 +00:00
Chris Lattner
a305d28cf6 Implement dynamic allocas correctly. In particular, because we were copying
directly out of R1 (without using a CopyFromReg, which uses a chain), multiple
allocas were getting CSE'd together, producing bogus code.  For this:

int %foo(bool %X, int %A, int %B) {
        br bool %X, label %T, label %F
F:
        %G = alloca int
        %H = alloca int
        store int %A, int* %G
        store int %B, int* %H
        %R = load int* %G
        ret int %R
T:
        ret int 0
}

We were generating:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        li r2, 16
        subf r2, r2, r1   ;; One alloca
        or r1, r2, r2
        or r3, r1, r1
        or r1, r2, r2
        or r2, r1, r1
        stw r4, 0(r3)
        stw r5, 0(r2)
        lwz r3, 0(r3)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

Now we generate:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        or r2, r1, r1
        li r3, 16
        subf r2, r3, r2  ;; Alloca 1
        or r1, r2, r2
        or r2, r1, r1
        or r6, r1, r1
        subf r3, r3, r6  ;; Alloca 2
        or r1, r3, r3
        or r3, r1, r1
        stw r4, 0(r2)
        stw r5, 0(r3)
        lwz r3, 0(r2)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

This fixes Povray and SPASS with the dag isel, the last two failing cases.
Tommorow we will hopefully turn it on by default! :)

llvm-svn: 23190
2005-09-01 21:31:30 +00:00
Chris Lattner
293b3a68e0 Fix a bug where we were useing HA to get the high part, which seems like it
could cause a miscompile.  Fixing this didn't fix the two programs that fail
though.  :(

This also changes the implementation to follow the pattern selector more
closely, causing us to select 0 to li instead of lis.

llvm-svn: 23189
2005-09-01 19:38:28 +00:00
Chris Lattner
34182aff7f Do not select the operands being passed into SelectCC. IT does this itself
and selecting early prevents folding immediates into the cmpw* instructions

llvm-svn: 23188
2005-09-01 19:20:44 +00:00
Chris Lattner
da2e04c69d Move FCTIWZ handling out of the instruction selectors and into legalization,
getting them out of the business of making stack slots.

llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner
6bad1fb19e Remove dead code
llvm-svn: 23179
2005-08-31 20:25:15 +00:00
Chris Lattner
2bd2af8ecd add assert zext/sext to the dag isel
llvm-svn: 23171
2005-08-31 18:08:46 +00:00
Chris Lattner
f4d594370b Fix 'ret long' to return the high and lo parts in the right registers. This
fixes crafty and probably others.

llvm-svn: 23167
2005-08-31 01:34:29 +00:00
Chris Lattner
69e9a9a94c now that physregs can exist in the same dag with multiple types, remove some
ugly hacks

llvm-svn: 23162
2005-08-30 22:59:48 +00:00
Chris Lattner
8f8d539746 Fix type mismatches when passing f32 values to calls
llvm-svn: 23159
2005-08-30 21:28:19 +00:00
Chris Lattner
9f23ae226f Fix some indentation (first hunks).
Remove code (last hunk) that miscompiled immediate and's, such as
  and uint %tmp.30, 4294958079

into

 andi. r8, r8, 56319
 andis. r8, r8, 65535

instead of:

 li r9, -9217
 and r8, r8, r9

The first always generates zero.

This fixes espresso.

llvm-svn: 23155
2005-08-30 18:37:48 +00:00
Chris Lattner
6a41fd75cd Fix a problem Nate found where we swapped the operands of SHL/SHR_PARTS. This
fixes fourinarow

llvm-svn: 23153
2005-08-30 17:42:59 +00:00
Chris Lattner
bdf3d3defb codegen ADD_PARTS correctly: put the results in the right registers! This
fixes fhourstones

llvm-svn: 23152
2005-08-30 17:40:13 +00:00
Chris Lattner
45706e9fb8 add operands in the right order, fixing McCat/18-imp with the dag isel
llvm-svn: 23150
2005-08-30 17:13:58 +00:00
Chris Lattner
7a59b1cf90 Make sure the selector emits register register copies with flag operands
linking them to calls when appropriate, this prevents the scheduler from
pulling these copies away from the call.

This fixes Ptrdist/yacr2

llvm-svn: 23143
2005-08-30 01:57:02 +00:00
Chris Lattner
e413b60632 The first operand to AND does not always have more than two operands. This
fixes MediaBench/toast with the dag selector

llvm-svn: 23141
2005-08-30 00:59:16 +00:00
Chris Lattner
61f7c3e843 emit FMR instructions to convert f64<->f32 instructions, so things like
STOREs, know the right type to store.

llvm-svn: 23139
2005-08-30 00:30:43 +00:00
Chris Lattner
12357281b8 fix a crash in cfrac
llvm-svn: 23137
2005-08-29 23:49:25 +00:00
Chris Lattner
1cbbe1015a Implement DYNAMIC_STACKALLOC, wrap some long lines
llvm-svn: 23136
2005-08-29 23:30:11 +00:00