Advanced Computer Architecture Chapter 7.
Large-Scale Shared-Memory Multiprocessors
• Bus inevitably becomes a bottleneck when many processors are used
Use a more general interconnection network
So snooping does not work (there is no shared bus on which every cache can observe every transaction)
DRAM memory is also distributed
Each node allocates space from local DRAM
Copies of remote data are made in cache
Major design issues:
How to find and represent the "directory" of each line?
How to find a copy of a line?
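To make the two design issues concrete, here is a minimal sketch of one common textbook answer: a full bit-vector directory kept at each line's home node. It is an illustration only and is not claimed to be the scheme S3.MP uses. The names (DirEntry, Directory, home_node) and the constants LINE_SIZE and NODE_MEM_SIZE are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum

LINE_SIZE = 64            # bytes per cache line (assumed for illustration)
NODE_MEM_SIZE = 1 << 20   # bytes of local DRAM per node (assumed)


class State(Enum):
    UNCACHED = 0    # no cache holds the line
    SHARED = 1      # one or more caches hold a read-only copy
    EXCLUSIVE = 2   # exactly one cache holds a writable copy


@dataclass
class DirEntry:
    state: State = State.UNCACHED
    sharers: int = 0   # bit vector: bit i set means node i holds a copy


def home_node(addr):
    """Node whose local DRAM backs this address (simple contiguous split, assumed)."""
    return addr // NODE_MEM_SIZE


class Directory:
    """Directory kept at a home node, one entry per memory line."""

    def __init__(self):
        self.entries = {}

    def entry(self, addr):
        return self.entries.setdefault(addr // LINE_SIZE, DirEntry())

    def holders(self, addr):
        """Answer 'where are the copies of this line?' from the bit vector."""
        bits = self.entry(addr).sharers
        return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

    def read(self, addr, node):
        """Record that `node` now caches a read-only copy."""
        e = self.entry(addr)
        # A previous exclusive owner is downgraded and stays in the sharer set.
        e.state = State.SHARED
        e.sharers |= 1 << node

    def write(self, addr, node):
        """Give `node` exclusive ownership; return the nodes to invalidate."""
        e = self.entry(addr)
        to_invalidate = [n for n in self.holders(addr) if n != node]
        e.state = State.EXCLUSIVE
        e.sharers = 1 << node
        return to_invalidate
```

A request for a line is sent to home_node(addr); the directory there replaces the bus broadcast by naming exactly which nodes must be contacted. The bit vector costs one bit per node per line, which is one reason large machines consider other representations.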
As a case study, we will look at S3.MP (Sun's Scalable Shared-memory Multi-Processor), a CC-NUMA (cache-coherent non-uniform memory access) architecture