Advanced Computer Architecture, Imperial College 2001
Large-Scale Shared-Memory Multiprocessors
• A shared bus inevitably becomes a bottleneck when many processors are used
– Use a more general interconnection network instead
– Without a shared broadcast medium, snooping no longer works
• Main memory (DRAM) is also distributed across the nodes
– Each node allocates space from local DRAM
– Copies of remote data are made in cache
• Major design issues:
– How to find and represent the “directory” of each line?
– How to find a copy of a line?
• As a case study, we will look at S3.MP (Sun's Scalable Shared memory Multi-Processor), a CC-NUMA (cache-coherent non-uniform memory access) architecture
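
To make the two design issues concrete, here is a minimal sketch of a generic directory scheme. It is not the S3.MP protocol. It assumes each line has a fixed home node derived from its address, and that the home keeps a full set of sharers per line. The constants `LINE_SIZE`, `NUM_NODES` and `LINES_PER_NODE`, and the names `home_node`, `DirEntry` and `Directory`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# All sizes below are illustrative assumptions, not S3.MP parameters.
LINE_SIZE = 32          # bytes per cache line
NUM_NODES = 4           # nodes in the machine
LINES_PER_NODE = 1024   # lines of local DRAM per node


def home_node(addr: int) -> int:
    """Return the node whose local DRAM holds the line containing addr."""
    return (addr // LINE_SIZE // LINES_PER_NODE) % NUM_NODES


@dataclass
class DirEntry:
    state: str = "UNCACHED"                    # UNCACHED, SHARED or EXCLUSIVE
    sharers: set = field(default_factory=set)  # nodes holding a cached copy


class Directory:
    """Directory state for all lines, keyed by line number (a sketch only)."""

    def __init__(self):
        self.entries = {}

    def _entry(self, addr: int) -> DirEntry:
        return self.entries.setdefault(addr // LINE_SIZE, DirEntry())

    def read(self, node: int, addr: int) -> set:
        """Record a read miss; return the nodes that must give up exclusivity."""
        e = self._entry(addr)
        to_downgrade = e.sharers - {node} if e.state == "EXCLUSIVE" else set()
        e.sharers.add(node)
        e.state = "SHARED"
        return to_downgrade

    def write(self, node: int, addr: int) -> set:
        """Record a write miss; return the nodes whose copies must be invalidated."""
        e = self._entry(addr)
        to_invalidate = e.sharers - {node}
        e.sharers = {node}
        e.state = "EXCLUSIVE"
        return to_invalidate
```

Here `home_node` answers "where is the directory of this line?", and the `sharers` set answers "where are the copies?". A real machine would send point-to-point messages over the interconnect to the nodes these methods return, rather than broadcasting as a snooping bus does.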