Increasing the complexity of a single CPU leads to diminishing returns:
- Lack of instruction-level parallelism
- Too many simultaneous accesses to one register file
- Forwarding wires between functional units become too long; inter-cluster communication takes >1 cycle
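
The first point, limited instruction-level parallelism, can be illustrated with a toy scheduling model. This is a minimal sketch and not a model of any real processor: the function `cycles_needed`, the unit instruction latency, and the example program of three independent dependency chains are all assumptions made for illustration.

```python
def cycles_needed(deps, width):
    """Cycles to issue all instructions on a toy machine (hypothetical model).

    deps[i] lists the instructions that instruction i depends on.
    Assumptions: every instruction has a latency of one cycle, and at most
    `width` ready instructions can be issued per cycle.
    """
    issued_in = [None] * len(deps)   # cycle in which each instruction issued
    remaining = set(range(len(deps)))
    cycle = 0
    while remaining:
        cycle += 1
        # An instruction is ready once all its producers issued in an earlier cycle.
        ready = [
            i for i in sorted(remaining)
            if all(issued_in[d] is not None and issued_in[d] < cycle for d in deps[i])
        ]
        for i in ready[:width]:
            issued_in[i] = cycle
            remaining.discard(i)
    return cycle


# Example program (assumed): 12 instructions forming three independent
# chains, where instruction i depends on instruction i - 3.
program = [[i - 3] if i >= 3 else [] for i in range(12)]

for width in (1, 2, 3, 4, 8):
    cycles = cycles_needed(program, width)
    print(f"issue width {width}: {cycles} cycles, IPC {len(program) / cycles:.2f}")
```

In this toy program only three instructions are ever independent, so widths 1, 2 and 3 take 12, 6 and 4 cycles, and any wider machine still takes 4: the extra issue slots stay idle.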