The guiding philosophy in designing the on-chip memory system was to
achieve uniform performance scaling by scaling both bandwidth and
latency. Both must scale because some programs have inherently
sequential data dependences and therefore expose little
instruction-level parallelism (ILP); for such programs, scaling
bandwidth alone would not help.
Memory scaling is a three-pronged problem: reducing average latency,
reducing main memory latency, and scaling on-chip bandwidth.
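
The argument that bandwidth alone does not help dependence-bound programs
can be illustrated with a toy timing model. This is only a sketch under
simplifying assumptions, not a model of the actual design: the function
name access_time, its parameters, and the cost formulas are hypothetical
and chosen purely for illustration.

```python
def access_time(n_accesses, latency, bandwidth, dependent):
    """Toy estimate of the cycles needed to finish n_accesses memory accesses.

    latency   -- cycles for one access to complete (assumed constant)
    bandwidth -- accesses the memory system can start per cycle (assumed constant)
    dependent -- True if each access needs the result of the previous one
    """
    if dependent:
        # A dependent chain (e.g. pointer chasing) is fully serialized:
        # every access waits out the full latency, so bandwidth is irrelevant.
        return n_accesses * latency
    # Independent accesses can be pipelined: the first result arrives after
    # one latency, and the rest are limited only by the issue bandwidth.
    return latency + (n_accesses - 1) / bandwidth


if __name__ == "__main__":
    for dependent in (True, False):
        base = access_time(100, latency=10, bandwidth=1, dependent=dependent)
        more_bw = access_time(100, latency=10, bandwidth=4, dependent=dependent)
        less_lat = access_time(100, latency=5, bandwidth=1, dependent=dependent)
        print(f"dependent={dependent}: base={base}, "
              f"4x bandwidth={more_bw}, half latency={less_lat}")
```

In this model, quadrupling bandwidth leaves the dependent case unchanged
while halving latency halves its run time; the independent case benefits
mainly from the added bandwidth.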