www: Finish first draft of the matmul example

llvm-svn: 130751
Tobias Grosser 2011-05-03 09:40:40 +00:00
parent c30448222a
commit e79a5e65c0
4 changed files with 197 additions and 52 deletions


@ -20,14 +20,15 @@
<p>Polly does not yet focus on end users, but on research and the development of <p>Polly does not yet focus on end users, but on research and the development of
new optimizations. Hence for the users of Polly it is often necessary to new optimizations. Hence for the users of Polly it is often necessary to
understand how Polly works internally. To get an overview of the different steps understand how Polly works internally. To get to know the different steps
taken during polyhedral compilation, we give a step by step example on how to taken during polyhedral compilation, we give a step by step example on how to
use the different Polly passes. For this we optimize a simple matrix use the different Polly passes. For this we optimize a simple matrix
multiplication kernel. In case you look for a more automated way of executing multiplication kernel. In case you look for a more automated way of executing
Polly, check out the pollycc tool in utils/pollycc.</p> Polly, check out the pollycc tool in utils/pollycc.</p>
The files used and created in this example are available <a The files used and created in this example are available <a
href="experiments/matmul">here</a>. href="experiments/matmul">here</a>. They can be created automatically by running
the <a href="experiments/matmul/runall.sh">runall.sh</a> script.
<ol> <ol>
<li><h4>Create LLVM-IR from the C code</h4> <li><h4>Create LLVM-IR from the C code</h4>
@ -57,14 +58,14 @@ alias opt="opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so"</pre>
Polly is only able to work with code that matches a canonical form. To translate Polly is only able to work with code that matches a canonical form. To translate
the LLVM-IR into this form we use a set of canonicalization passes. For this the LLVM-IR into this form we use a set of canonicalization passes. For this
example only three passes are necessary. To get good coverage on a larger set example only three passes are necessary. To get good coverage on more
of input files a larger set is needed. pollycc contains a set of passes that has complicated input files often more canonicalization passes are needed. pollycc
shown to be beneficial. contains a list of passes that have proven to be beneficial.
<pre class="code">opt -S -mem2reg -loop-simplify -indvars matmul.s &gt; matmul.preopt.ll</pre></li> <pre class="code">opt -S -mem2reg -loop-simplify -indvars matmul.s &gt; matmul.preopt.ll</pre></li>
<li><h4>Show the SCoPs detected by Polly (optional)</h4> <li><h4>Show the SCoPs detected by Polly (optional)</h4>
To understand if Polly was able to detect some SCoPs, we print the To understand if Polly was able to detect SCoPs, we print the
structure of the detected SCoPs. In our example two SCoPs were detected. One in structure of the detected SCoPs. In our example two SCoPs were detected. One in
'init_array' the other in 'main'. 'init_array' the other in 'main'.
@ -112,7 +113,8 @@ view-scops-only:
<pre class="code">opt -basicaa -polly-scops -analyze matmul.preopt.ll</pre> <pre class="code">opt -basicaa -polly-scops -analyze matmul.preopt.ll</pre>
<pre> <pre>
[...] [...]
Printing analysis 'Polly - Create polyhedral description of Scops' for region: '%1 =&gt;&nbsp;%17' in function 'init_array': Printing analysis 'Polly - Create polyhedral description of Scops' for region:
'%1 =&gt;&nbsp;%17' in function 'init_array':
Context: Context:
{ [] } { [] }
Statements { Statements {
@ -135,9 +137,9 @@ Printing analysis 'Polly - Create polyhedral description of Scops' for region: '
ReadAccess&nbsp;:= ReadAccess&nbsp;:=
{ FinalRead[i0] -&gt; MemRef_B[o0] }; { FinalRead[i0] -&gt; MemRef_B[o0] };
} }
Printing analysis 'Polly - Create polyhedral description of Scops' for region: '%0 =&gt; &lt;Function Return&gt;' in function 'init_array':
[...] [...]
Printing analysis 'Polly - Create polyhedral description of Scops' for region: '%1 =&gt;&nbsp;%17' in function 'main': Printing analysis 'Polly - Create polyhedral description of Scops' for region:
'%1 =&gt;&nbsp;%17' in function 'main':
Context: Context:
{ [] } { [] }
Statements { Statements {
@ -173,14 +175,14 @@ Printing analysis 'Polly - Create polyhedral description of Scops' for region: '
ReadAccess&nbsp;:= ReadAccess&nbsp;:=
{ FinalRead[i0] -&gt; MemRef_B[o0] }; { FinalRead[i0] -&gt; MemRef_B[o0] };
} }
Printing analysis 'Polly - Create polyhedral description of Scops' for region: '%0 =&gt; &lt;Function Return&gt;' in function 'main': [...]
Invalid Scop!
</pre> </pre>
</li> </li>
<li><h4>Show the dependences for the SCoPs</h4> <li><h4>Show the dependences for the SCoPs</h4>
<pre class="code">opt -basicaa -polly-dependences -analyze matmul.preopt.ll</pre> <pre class="code">opt -basicaa -polly-dependences -analyze matmul.preopt.ll</pre>
<pre>Printing analysis 'Polly - Calculate dependences for SCoP' for region: 'for.cond =&gt; for.end28' in function 'init_array': <pre>Printing analysis 'Polly - Calculate dependences for SCoP' for region:
'for.cond =&gt; for.end28' in function 'init_array':
Must dependences: Must dependences:
{ } { }
May dependences: May dependences:
@ -189,7 +191,8 @@ Invalid Scop!
{ } { }
May no source: May no source:
{ } { }
Printing analysis 'Polly - Calculate dependences for SCoP' for region: 'for.cond =&gt; for.end48' in function 'main': Printing analysis 'Polly - Calculate dependences for SCoP' for region:
'for.cond =&gt; for.end48' in function 'main':
Must dependences: Must dependences:
{ Stmt_4[i0, i1] -&gt; Stmt_6[i0, i1, 0]&nbsp;: { Stmt_4[i0, i1] -&gt; Stmt_6[i0, i1, 0]&nbsp;:
i0 &gt;= 0 and i0 &lt;= 1023 and i1 &gt;= 0 and i1 &lt;= 1023; i0 &gt;= 0 and i0 &lt;= 1023 and i1 &gt;= 0 and i1 &lt;= 1023;
@ -228,51 +231,191 @@ Writing SCoP 'for.cond =&gt; for.end48' in function 'main' to './main___%for.con
<li><h4>Import the changed jscop files and print the updated SCoP structure <li><h4>Import the changed jscop files and print the updated SCoP structure
(optional)</h4> (optional)</h4>
<p>Polly can import jscop files, where the schedules of the statements were <p>Polly can reimport jscop files, in which the schedules of the statements are
changed. With the help of these updated files we can import transformations into changed. These changed schedules are used to describe transformations.
Polly. It is possible to import different jscop files by providing the postfix It is possible to import different jscop files by providing the postfix
of the jscop file that is imported.</p> of the jscop file that is imported.</p>
<p> The optimized jscop files for this example are hand written. The schedule <p> We apply three different transformations on the SCoP in the main function.
used was inspired by looking at the optimizations PoCC performs. If PoCC is The jscop files describing these transformations are hand written. If PoCC is
installed Polly can often calculate such schedules fully automatically.</p> installed Polly can sometimes calculate such schedules fully automatically.
However, this is still an area we are actively working on.</p>
<h5>No Polly</h5>
<pre class="code">opt -basicaa -polly-import-jscop -polly-print -disable-output matmul.preopt.ll -polly-import-jscop-postfix=.opt</pre> <p>As a baseline we do not call any Polly code generation, but only apply the
<pre>Cannot open file: ./init_array___%for.cond---%for.end28.jscop.opt normal -O3 optimizations.</p>
Skipping import.
In function: 'init_array' SCoP: for.cond =&gt; for.end28: <pre class="code">
for (c2=0;c2&lt;=1023;c2++) { opt matmul.preopt.ll -basicaa \
for (c4=0;c4&lt;=1023;c4++) { -polly-import-jscop \
&nbsp;%for.body4(c2,c4); -polly-cloog -analyze
} </pre>
} <pre>
Reading SCoP 'for.cond =&gt; for.end48' in function 'main' from './main___%for.cond---%for.end48.scop.opt.opt'. [...]
In function: 'main' SCoP: for.cond =&gt; for.end48: main():
for (c2=0;c2&lt;=1023;c2++) { for (c2=0;c2&lt;=1535;c2++) {
for (c4=0;c4&lt;=1023;c4++) { for (c4=0;c4&lt;=1535;c4++) {
&nbsp;%for.body4(c2,c4); Stmt_4(c2,c4);
} for (c6=0;c6&lt;=1535;c6++) {
} Stmt_6(c2,c4,c6);
for (c2=0;c2&lt;=1023;c2++) {
for (c3=0;c3&lt;=1023;c3++) {
for (c4=0;c4&lt;=1023;c4++) {
&nbsp;%for.body12(c2,c4,c3);
} }
} }
} }
</pre></li> [...]
</pre>
<h5>Interchange (and Fission to allow the interchange)</h5>
<p>We split the loops and can now apply an interchange of the loop dimensions that
enumerate Stmt_6.</p>
<pre class="code">
opt matmul.preopt.ll -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged \
-polly-cloog -analyze
</pre>
<pre>
[...]
Reading JScop '%1 =&gt; %17' in function 'main' from './main___%1---%17.jscop.interchanged'.
[...]
main():
for (c2=0;c2&lt;=1535;c2++) {
for (c4=0;c4&lt;=1535;c4++) {
Stmt_4(c2,c4);
}
}
for (c2=0;c2&lt;=1535;c2++) {
for (c4=0;c4&lt;=1535;c4++) {
for (c6=0;c6&lt;=1535;c6++) {
Stmt_6(c2,c6,c4);
}
}
}
[...]
</pre>
<h5>Interchange + Tiling</h5>
<p>In addition to the interchange, we now tile the second loop nest.</p>
<pre class="code">
opt matmul.preopt.ll -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \
-polly-cloog -analyze
</pre>
<pre>
[...]
Reading JScop '%1 =&gt; %17' in function 'main' from './main___%1---%17.jscop.interchanged+tiled'.
[...]
main():
for (c2=0;c2&lt;=1535;c2++) {
for (c4=0;c4&lt;=1535;c4++) {
Stmt_4(c2,c4);
}
}
for (c2=0;c2&lt;=1535;c2+=64) {
for (c3=0;c3&lt;=1535;c3+=64) {
for (c4=0;c4&lt;=1535;c4+=64) {
for (c5=c2;c5&lt;=c2+63;c5++) {
for (c6=c4;c6&lt;=c4+63;c6++) {
for (c7=c3;c7&lt;=c3+63;c7++) {
Stmt_6(c5,c7,c6);
}
}
}
}
}
}
[...]
</pre>
<h5>Interchange + Tiling + Strip-mining to prepare vectorization</h5>
To allow vectorization later, we create a so-called trivially parallelizable
loop. It is innermost, parallel and has only four iterations. It can be
replaced by 4-element SIMD instructions.
<pre class="code">
opt matmul.preopt.ll -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-polly-cloog -analyze </pre>
<pre>
[...]
Reading JScop '%1 =&gt; %17' in function 'main' from './main___%1---%17.jscop.interchanged+tiled+vector'.
[...]
main():
for (c2=0;c2&lt;=1535;c2++) {
for (c4=0;c4&lt;=1535;c4++) {
Stmt_4(c2,c4);
}
}
for (c2=0;c2&lt;=1535;c2+=64) {
for (c3=0;c3&lt;=1535;c3+=64) {
for (c4=0;c4&lt;=1535;c4+=64) {
for (c5=c2;c5&lt;=c2+63;c5++) {
for (c6=c4;c6&lt;=c4+63;c6++) {
for (c7=c3;c7&lt;=c3+63;c7+=4) {
for (c8=c7;c8&lt;=c7+3;c8++) {
Stmt_6(c5,c8,c6);
}
}
}
}
}
}
}
[...]
</pre>
</li>
<li><h4>Codegenerate the SCoPs</h4> <li><h4>Codegenerate the SCoPs</h4>
<p>
This generates new code for the SCoPs detected by polly. This generates new code for the SCoPs detected by Polly.
If -polly-import is present, transformations specified in the imported openscop If -polly-import-jscop is present, transformations specified in the imported jscop
files will be applied. files will be applied.</p>
<pre class="code">opt -basicaa -polly-import -polly-import-postfix=.opt -polly-codegen matmul.preopt.ll | opt -O3 &gt; matmul.pollyopt.ll</pre> <pre class="code">opt matmul.preopt.ll | opt -O3 &gt; matmul.normalopt.ll</pre>
<pre class="code">
opt -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged \
-polly-codegen matmul.preopt.ll \
| opt -O3 &gt; matmul.polly.interchanged.ll</pre>
<pre> <pre>
Cannot open file: ./init_array___%for.cond---%for.end28.scop.opt Reading JScop '%1 =&gt; %19' in function 'init_array' from
Skipping import. './init_array___%1---%19.jscop.interchanged'.
Reading SCoP 'for.cond =&gt; for.end48' in function 'main' from './main___%for.cond---%for.end48.scop.opt'.</pre> File could not be read: No such file or directory
Reading JScop '%1 =&gt; %17' in function 'main' from
<pre class="code">opt matmul.preopt.ll | opt -O3 &gt; matmul.normalopt.ll</pre></li> './main___%1---%17.jscop.interchanged'.
</pre>
<pre class="code">
opt -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \
-polly-codegen matmul.preopt.ll \
| opt -O3 &gt; matmul.polly.interchanged+tiled.ll</pre>
<pre>
Reading JScop '%1 =&gt; %19' in function 'init_array' from
'./init_array___%1---%19.jscop.interchanged+tiled'.
File could not be read: No such file or directory
Reading JScop '%1 =&gt; %17' in function 'main' from
'./main___%1---%17.jscop.interchanged+tiled'.
</pre>
<pre class="code">
opt -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-polly-codegen -enable-polly-vector matmul.preopt.ll \
| opt -O3 &gt; matmul.polly.interchanged+tiled+vector.ll</pre>
<pre>
Reading JScop '%1 =&gt; %19' in function 'init_array' from
'./init_array___%1---%19.jscop.interchanged+tiled+vector'.
File could not be read: No such file or directory
Reading JScop '%1 =&gt; %17' in function 'main' from
'./main___%1---%17.jscop.interchanged+tiled+vector'.
</pre>
<pre class="code">
opt -basicaa \
-polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-polly-codegen -enable-polly-vector -enable-polly-openmp matmul.preopt.ll \
| opt -O3 &gt; matmul.polly.interchanged+tiled+vector+openmp.ll</pre>
<pre>
Reading JScop '%1 =&gt; %19' in function 'init_array' from
'./init_array___%1---%19.jscop.interchanged+tiled+vector'.
File could not be read: No such file or directory
Reading JScop '%1 =&gt; %17' in function 'main' from
'./main___%1---%17.jscop.interchanged+tiled+vector'.
</pre>
<li><h4>Create the executables</h4> <li><h4>Create the executables</h4>
@ -290,8 +433,7 @@ llc matmul.polly.interchanged+tiled.ll -o matmul.polly.interchanged+tiled.s &amp
llc matmul.polly.interchanged+tiled+vector.ll -o matmul.polly.interchanged+tiled+vector.s &amp;&amp; \ llc matmul.polly.interchanged+tiled+vector.ll -o matmul.polly.interchanged+tiled+vector.s &amp;&amp; \
gcc matmul.polly.interchanged+tiled+vector.s -o matmul.polly.interchanged+tiled+vector.exe gcc matmul.polly.interchanged+tiled+vector.s -o matmul.polly.interchanged+tiled+vector.exe
llc matmul.polly.interchanged+tiled+vector+openmp.ll -o matmul.polly.interchanged+tiled+vector+openmp.s &amp;&amp; \ llc matmul.polly.interchanged+tiled+vector+openmp.ll -o matmul.polly.interchanged+tiled+vector+openmp.s &amp;&amp; \
gcc -lgomp matmul.polly.interchanged+tiled+vector+openmp.s -o matmul.polly.interchanged+tiled+vector+openmp.exe gcc -lgomp matmul.polly.interchanged+tiled+vector+openmp.s -o matmul.polly.interchanged+tiled+vector+openmp.exe </pre>
</pre>
<li><h4>Compare the runtime of the executables</h4> <li><h4>Compare the runtime of the executables</h4>


@ -1,11 +1,10 @@
#!/bin/sh -a #!/bin/sh -a
echo "--> 1. Create LLVM-IR from C" echo "--> 1. Create LLVM-IR from C"
clang -S -emit-llvm matmul.c -o matmul.s clang -S -emit-llvm matmul.c -o matmul.s
echo "--> 2. Load Polly automatically when calling the 'opt' tool" echo "--> 2. Load Polly automatically when calling the 'opt' tool"
export PATH_TO_POLLY_LIB="~/Projekte/polly/build_clang/lib/" export PATH_TO_POLLY_LIB="$HOME/polly/build/lib/"
alias opt="opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so" alias opt="opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so"
echo "--> 3. Prepare the LLVM-IR for Polly" echo "--> 3. Prepare the LLVM-IR for Polly"
@ -40,10 +39,13 @@ echo "--> 8. Export jscop files"
opt -basicaa -polly-export-jscop matmul.preopt.ll opt -basicaa -polly-export-jscop matmul.preopt.ll
echo "--> 9. Import the updated jscop files and print the new SCoPs. (optional)" echo "--> 9. Import the updated jscop files and print the new SCoPs. (optional)"
opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll
opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll \ opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll \
-polly-import-jscop-postfix=interchanged -polly-import-jscop-postfix=interchanged
opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll \ opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll \
-polly-import-jscop-postfix=interchanged+tiled -polly-import-jscop-postfix=interchanged+tiled
opt -basicaa -polly-import-jscop -polly-cloog -analyze matmul.preopt.ll \
-polly-import-jscop-postfix=interchanged+tiled+vector
echo "--> 10. Codegenerate the SCoPs" echo "--> 10. Codegenerate the SCoPs"
opt -basicaa -polly-import-jscop -polly-import-jscop-postfix=interchanged \ opt -basicaa -polly-import-jscop -polly-import-jscop-postfix=interchanged \


@ -11,6 +11,7 @@
position:absolute; position:absolute;
left:29ex; left:29ex;
padding-right:4ex; padding-right:4ex;
max-width: 50em;
} }
/**************/ /**************/


@ -8,7 +8,7 @@
<a href="index.html">About</a> <a href="index.html">About</a>
<a href="todo.html">Todo</a> <a href="todo.html">Todo</a>
<a href="passes.html">LLVM Passes</a> <a href="passes.html">LLVM Passes</a>
<!-- <a href="examples.html">Examples</a> --> <a href="examples.html">Examples</a>
<a href="performance.html">Performance</a> <a href="performance.html">Performance</a>
<a href="publications.html">Publications</a> <a href="publications.html">Publications</a>
<a href="contributors.html">Contributors</a> <a href="contributors.html">Contributors</a>