KELLY 4: Conclusion


Parallelism is being exploited at three levels (pipelining of arithmetic, multiple FPUs, and multiple processors), and the factors multiply:

Arithmetic pipelining    FP pipeline depth     $\times$ 5
Multiple FPUs            Control complexity    $\times$ 3
Multiple CPUs            Interconnection       $\times$ 16

Total parallelism                              $\times$ 240
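The total is simply the product of the three per-level factors listed above. A minimal sketch of that arithmetic follows; the names `LEVELS` and `total_parallelism` are illustrative only and do not come from the original notes.

```python
from math import prod

# Per-level parallelism factors, copied from the figures above.
# The dictionary name and keys are illustrative, not from the notes.
LEVELS = {
    "arithmetic pipelining": 5,   # FP pipeline depth
    "multiple FPUs": 3,           # bounded by control complexity
    "multiple CPUs": 16,          # bounded by interconnection
}


def total_parallelism(levels):
    """Return the product of the per-level parallelism factors."""
    return prod(levels.values())


print(total_parallelism(LEVELS))  # prints 240
```

Because the factors multiply, falling short at any one level scales the whole product down by the same proportion.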
Memory system parallelism is needed to sustain this performance on applications that do not fit into cache.
For peak performance we must write programs that are optimal at all three levels, but sub-optimal programs can still run at a large fraction of peak.

The moral:

High performance computing relies on achieving the maximum cost-effective advantage at every level.