Performance of blocked version
Blocking factor    Execution time (s)    MFLOPS
       8                 3.815             70.4
      16                 2.784             96.4
      32                 2.283            117.6
      40                 2.193            122.4
      48                 2.253            119.1
      56                 2.473            108.5
      64                 3.404             78.9
      72                 5.608             47.9
      80                 5.578             48.1
      88                 5.808             46.2
      96                 5.928             45.3
     104                 6.309             42.5
     112                 5.778             46.5
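
The measurements above can be cross-checked with a little arithmetic. The sketch below is only an illustration, not part of the original experiment: it assumes the execution times are in seconds and that MFLOPS means millions of floating-point operations per second, so that time multiplied by MFLOPS gives the amount of work in millions of operations. The names `measurements` and `implied_mflop` are hypothetical helpers introduced here.

```python
# (blocking factor, execution time, MFLOPS), copied from the measurements above.
# Assumption: times are in seconds.
measurements = [
    (8, 3.815, 70.4),
    (16, 2.784, 96.4),
    (32, 2.283, 117.6),
    (40, 2.193, 122.4),
    (48, 2.253, 119.1),
    (56, 2.473, 108.5),
    (64, 3.404, 78.9),
    (72, 5.608, 47.9),
    (80, 5.578, 48.1),
    (88, 5.808, 46.2),
    (96, 5.928, 45.3),
    (104, 6.309, 42.5),
    (112, 5.778, 46.5),
]


def implied_mflop(time_s, mflops):
    """Work implied by one row, in millions of floating-point operations."""
    return time_s * mflops


# Every row should describe the same computation, so the implied work
# should be (nearly) identical across blocking factors.
work = [implied_mflop(t, m) for _, t, m in measurements]

# Row with the highest MFLOPS rate, and the slowest row for comparison.
best = max(measurements, key=lambda row: row[2])
worst = max(measurements, key=lambda row: row[1])
slowdown = worst[1] / best[1]

print("implied work (MFLOP): min %.1f, max %.1f" % (min(work), max(work)))
print("best blocking factor:", best[0])
print("slowest / fastest time ratio: %.2f" % slowdown)
```

Under these assumptions the implied work is nearly constant across rows, which suggests that every run performs the same computation and that only the blocking factor changes how quickly it completes. The highest rate in the table is at a blocking factor of 40.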