Refer to the article you have been given, entitled:
- The Microarchitecture of the Pentium 4 Processor by Hinton, Sager et al (Intel Technology Journal, Q1 2001, henceforth ``MPP'').
- 1.
- Is the Pentium 4 suitable for use in a laptop computer?
- 2.
- What is a uop?
- 3.
- The Pentium 4 can execute instructions faster than it can
decode them. How?
- 4.
- What does ``Branch History Update'' mean in Figure 1?
- 5.
- Figure 4 shows seven functional units.
What is the maximum number of functional units that can,
in principle, be active simultaneously in one cycle? See page 7.
- 6.
- The front-end bandwidth and the retirement bandwidth are each,
apparently, 3 uops per clock. How could more than 3 uops be
executed in the same cycle?
- 7.
- What is a RAT? (See Figure 5 and page 6.) When is the RAT used?
- 8.
- (Page 7). ROB entries track uop status and are allocated and
deallocated sequentially. What happens when a uop is retired?
- 9.
- Consider the IA-32 instruction:
addl 12(%ebp),%edx
This takes the value in register ebp, adds 12 to it and uses the
result as the address from which to fetch its first operand, which
it then adds to register edx. Presumably, this instruction is translated into
two uops - a load and an add. The load uop (and its address
calculation) is executed by the
Load AGU (``address generating unit'') shown in Figure 4.
How do you think the value read from location (ebp+12) is
passed from the load uop to the add uop?
- 10.
- IA-32 instructions must be executed atomically from the point
of view of external interrupts (though not from the point of view of
other CPUs or I/O devices). How do you think the Pentium 4
ensures that external interrupts are serviced at appropriate
points?
- 11.
- What questions have been omitted from this list because they
appear in the exam?
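
For reference, the architectural effect of the instruction in question 9 (addl 12(%ebp),%edx) can be sketched in C. This is only a model of what the instruction computes, not of how the Pentium 4 executes it: the `Regs` struct, the function name and the flat byte-array memory are assumptions made for illustration, EFLAGS updates are ignored, and nothing here describes how uops pass values to one another.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the two registers the instruction touches. */
typedef struct {
    uint32_t ebp;
    uint32_t edx;
} Regs;

/* Architectural effect of: addl 12(%ebp),%edx
 * Memory is modelled as a flat byte array (an assumption); the caller
 * must ensure that ebp + 12 .. ebp + 15 lies inside it.
 * Condition flags (EFLAGS) are not modelled. */
static void addl_12_ebp_edx(Regs *r, const uint8_t *mem) {
    uint32_t addr = r->ebp + 12;                  /* address calculation */
    uint32_t operand;
    memcpy(&operand, mem + addr, sizeof operand); /* 32-bit load */
    r->edx += operand;                            /* add, wraps modulo 2^32 */
}
```

Only edx changes; ebp and memory are left as they were.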
Paul Kelly, Imperial College, November 2001