Large-Scale Shared-Memory Multiprocessors
• A shared bus inevitably becomes a bottleneck when many
processors are used
– Use a more general interconnection network instead
– Without a broadcast medium, snooping does not work
• Memory (DRAM) is also distributed across the nodes
–  Each node allocates space from local DRAM
–  Copies of remote data are made in cache
• Major design issues:
– How to find and represent the “directory” of each line?
– How to find a copy of a line?
• As a case study, we will look at S3.MP (Sun's
Scalable Shared memory Multi-Processor), a CC-NUMA
(cache-coherent non-uniform memory access)
architecture
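
To make the two design questions above concrete, here is a minimal sketch of one common answer: a full bit-vector directory kept at each line's home node. It is not S3.MP's actual scheme. The names `home_node`, `DirEntry`, `Directory`, `NUM_NODES`, and `LINE_SIZE`, and the interleaved address-to-home mapping, are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

NUM_NODES = 4    # assumed number of nodes
LINE_SIZE = 64   # assumed cache-line size in bytes


def home_node(addr: int) -> int:
    """Assumed mapping: lines are interleaved across nodes by line number."""
    return (addr // LINE_SIZE) % NUM_NODES


@dataclass
class DirEntry:
    sharers: int = 0     # bit i set => node i holds a copy of the line
    dirty: bool = False  # True => the single sharer holds a modified copy


@dataclass
class Directory:
    """Directory for the lines whose home is this node (hypothetical layout)."""
    entries: dict = field(default_factory=dict)

    def _entry(self, line: int) -> DirEntry:
        return self.entries.setdefault(line, DirEntry())

    def holders(self, line: int) -> list:
        """Answer 'where are the copies?' by decoding the bit vector."""
        e = self._entry(line)
        return [n for n in range(NUM_NODES) if (e.sharers >> n) & 1]

    def read(self, line: int, node: int) -> list:
        """Record a read; return the owner that must write back first, if any."""
        e = self._entry(line)
        to_contact = []
        if e.dirty and not (e.sharers >> node) & 1:
            to_contact = self.holders(line)  # exactly one dirty owner
            e.dirty = False
        e.sharers |= 1 << node
        return to_contact

    def write(self, line: int, node: int) -> list:
        """Record a write; return the nodes whose copies must be invalidated."""
        e = self._entry(line)
        to_invalidate = [n for n in self.holders(line) if n != node]
        e.sharers = 1 << node
        e.dirty = True
        return to_invalidate
```

Because the directory names the holders of each line, a write sends invalidations only to those nodes, which is what replaces the bus broadcast that snooping relied on. The cost of this particular representation is one bit per node per line, which is one reason the directory's representation is a design issue in its own right.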