Caches are essential to gain the maximum performance from modern microprocessors
The performance of a cache-based memory system is close to that of SRAM, at a cost per bit close to that of DRAM
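The SRAM-speed, DRAM-cost claim can be checked with the usual average-access-time relation (hit time plus miss rate times miss penalty). The sketch below is only illustrative: the function name and all three figures are assumptions chosen for the example, not measurements from any particular machine.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus the expected cost of a miss."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Assumed, illustrative figures (not taken from the text above).
SRAM_HIT_NS = 2.0        # time to hit in the SRAM cache
DRAM_PENALTY_NS = 60.0   # extra time to fetch a line from DRAM on a miss
MISS_RATE = 0.03         # fraction of references that miss in the cache

amat = average_access_time(SRAM_HIT_NS, MISS_RATE, DRAM_PENALTY_NS)
print(f"average access time: {amat:.1f} ns")
```

With these assumed numbers the average stays within a couple of nanoseconds of the SRAM hit time, even though almost all of the capacity is DRAM.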
Caches can be used to form the basis of a parallel computer
Bus-based multiprocessors do not scale well: the practical maximum is fewer than 10 nodes
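A rough way to see why a shared bus saturates is to divide the bus bandwidth by the miss traffic each processor generates. The sketch below makes strong simplifying assumptions (every miss transfers exactly one cache line, no coherence or write-back traffic, no arbitration overhead), and every name and figure in it is hypothetical, picked only to illustrate the arithmetic.

```python
def miss_traffic_bytes_per_s(refs_per_s, misses_per_1000_refs, line_bytes):
    """Bus traffic one processor generates, assuming one line transfer per miss."""
    return refs_per_s * misses_per_1000_refs * line_bytes // 1000


def max_processors_on_bus(bus_bytes_per_s, per_cpu_bytes_per_s):
    """Largest processor count whose combined miss traffic fits on the bus."""
    return bus_bytes_per_s // per_cpu_bytes_per_s


# Assumed, illustrative figures (not taken from the text above).
REFS_PER_S = 100_000_000        # memory references per second per processor
MISSES_PER_1000_REFS = 20       # i.e. a 2% cache miss rate
LINE_BYTES = 32                 # cache line size
BUS_BYTES_PER_S = 512_000_000   # peak shared-bus bandwidth

per_cpu = miss_traffic_bytes_per_s(REFS_PER_S, MISSES_PER_1000_REFS, LINE_BYTES)
limit = max_processors_on_bus(BUS_BYTES_PER_S, per_cpu)
print(f"per-processor traffic: {per_cpu} bytes/s, bus saturates at {limit} processors")
```

Under these assumptions the bus is fully used by a single-digit number of processors. A real bus saturates earlier, since coherence traffic and arbitration are ignored here.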
Larger-scale shared-memory multiprocessors require more complicated networks and protocols
CC-NUMA (cache-coherent non-uniform memory access) is becoming popular since systems can be built from commodity components (chips, boards, OSs) and can use existing software
e.g. systems from HP/Convex, Sequent, Data General, SGI, Sun, IBM