In a multi-issue superscalar, several branches could
appear in a single instruction packet.
Can we predict the outcome of all of them as
soon as the packet is fetched?
In many cases yes, though the branch prediction table
structure gets somewhat messy
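
A minimal sketch of predicting every branch in a packet with one lookup. It assumes one table row per fetch packet and one 2-bit saturating counter per branch slot. The names PacketPredictor, entries and slots are hypothetical and not taken from any real design:

```python
class PacketPredictor:
    """Hypothetical multi-branch predictor: one row per fetch packet,
    one 2-bit saturating counter per branch slot in that packet."""

    def __init__(self, entries=1024, slots=4):
        self.entries = entries
        self.slots = slots
        # Counters start at 1 (weakly not-taken); 2 and 3 mean "predict taken".
        self.table = [[1] * slots for _ in range(entries)]

    def _index(self, packet_addr):
        # Assumption: simple modulo indexing on the packet address.
        return packet_addr % self.entries

    def predict(self, packet_addr):
        """Return a taken/not-taken guess for every slot in a single lookup."""
        return [c >= 2 for c in self.table[self._index(packet_addr)]]

    def update(self, packet_addr, slot, taken):
        """Train one slot's counter with the resolved branch outcome."""
        row = self.table[self._index(packet_addr)]
        row[slot] = min(3, row[slot] + 1) if taken else max(0, row[slot] - 1)
```

Only the predictions up to the first predicted-taken branch matter for the next fetch. Handling that, together with the wider table rows, is part of why the structure gets messy.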
Extend the idea: can we predict which trace control
will take through a sequence of branches?
yes
suits both bimodal and correlated patterns
get the branch target buffer to cache this trace as a
single issue packet
this trace cache idea is becoming important
e.g. Pentium 4
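
A rough sketch of the trace cache idea, assuming a trace is identified by its start address plus the predicted outcomes of the branches inside it. TraceCache, max_branches, fill and lookup are hypothetical names for illustration and do not describe any particular processor:

```python
class TraceCache:
    """Hypothetical trace cache: maps (start PC, branch outcomes) to the
    sequence of instruction addresses along that path."""

    def __init__(self, max_branches=3):
        self.max_branches = max_branches  # assumed limit on branches per trace
        self.lines = {}

    def _key(self, start_pc, outcomes):
        # Only the first max_branches outcomes identify the trace.
        return (start_pc, tuple(outcomes[:self.max_branches]))

    def fill(self, start_pc, outcomes, pcs):
        """Store a trace that has already been assembled elsewhere."""
        self.lines[self._key(start_pc, outcomes)] = list(pcs)

    def lookup(self, start_pc, predicted):
        """Return the whole trace if the predicted path is cached, else None."""
        return self.lines.get(self._key(start_pc, predicted))
```

A hit returns the instructions from several basic blocks at once, so fetch does not stop at each taken branch. A miss (None) would fall back to the ordinary instruction cache. The sketch simply stores whatever trace it is handed and does not model how traces are assembled.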
But how is the cache filled? What can we do in the fill unit?