We get
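ldd [%o0],%f8 ! run-up: load f8 for the first iteration
L313:
fmuld %f6,%f8,%f2 ! f2 = f6 * f8 (iteration j)
ldd [%o1],%f4 ! load f4 for iteration j
ldd [%o0+8],%f8 ! load f8 for the next iteration (j+1)
add %o1,8,%o1 ! advance the %o1 pointer
faddd %f2,%f4,%f2 ! f2 = f2 + f4
add %o0,8,%o0 ! advance the %o0 pointer
cmp %o0,%o2 ! reached the end address in %o2?
blu L313 ! if not, loop again
std %f2,[%o1-8] ! delay slot: store the result for iteration j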
- Note that work is being done concurrently on both iteration
j and iteration j+1.
- This idea is called "software pipelining". It is possible
to spread the loop body over several iterations, at the cost
of ever more complicated run-up and run-down code.
- Execution time: 33.2 seconds (7.53 MFLOPS).
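
The same transformation can be sketched at source level. The C below is an illustration only, not the code the listing was compiled from: the function names daxpy_plain and daxpy_pipelined, and the assumption that the loop computes y[j] = a*x[j] + y[j], are inferred from the assembly. Unlike the listing, which loads one element past the end on the final trip, the sketch peels the last iteration into explicit run-down code.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: each iteration loads, multiplies, adds and stores. */
void daxpy_plain(size_t n, double a, const double *x, double *y)
{
    for (size_t j = 0; j < n; j++)
        y[j] = a * x[j] + y[j];
}

/* One-stage software pipeline: the load of x[j+1] is issued while
   the arithmetic for iteration j is still in progress. */
void daxpy_pipelined(size_t n, double a, const double *x, double *y)
{
    if (n == 0)
        return;
    assert(x != NULL && y != NULL);

    double xj = x[0];                 /* run-up: load for the first iteration */
    for (size_t j = 0; j + 1 < n; j++) {
        double prod = a * xj;         /* multiply for iteration j */
        double yj = y[j];             /* load y for iteration j */
        xj = x[j + 1];                /* load x for iteration j+1 */
        y[j] = prod + yj;             /* add and store for iteration j */
    }
    y[n - 1] = a * xj + y[n - 1];     /* run-down: finish the last iteration */
}
```

Both functions should leave the same values in y. Whether the hand-pipelined form is actually faster depends on the compiler and the machine, since an optimising compiler may already schedule the loads this way.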
Paul H J Kelly
Thu Feb 6 22:02:49 GMT 1997