332 Advanced Computer Architecture

Tutorial exercise 8

Refer to the article you have been given.
1.
Is the Pentium 4 suitable for use in a laptop computer?
2.
What is a uop?
3.
The Pentium 4 can execute instructions faster than it can decode them. How?
4.
What does "Branch History Update" mean in Figure 1?
5.
Figure 4 shows seven functional units. What is the maximum number of functional units that can, in principle, be active simultaneously in one cycle? See page 7.
6.
The front-end bandwidth and the retirement bandwidth are, apparently, each 3 uops per clock. How could more than 3 uops be executed in the same cycle?
7.
What is a RAT (see Figure 5 and page 6)? When is the RAT used?
8.
(Page 7). ROB entries track uop status and are allocated and deallocated sequentially. What happens when a uop is retired?
9.
Consider the IA-32 instruction:

addl 12(%ebp),%edx
This takes the value in register ebp, adds 12 to it and uses the result as the address from which to fetch its first operand, which it then adds to the register edx. Presumably, this instruction is translated into two uops: a load and an add. The load uop (and its address calculation) is executed by the Load AGU ("address generating unit") shown in Figure 4. How do you think the value read from location (ebp+12) is passed from the load uop to the add uop?
10.
IA-32 instructions must be executed atomically from the point of view of external interrupts (though not from the point of view of other CPUs or I/O devices). How do you think the Pentium 4 ensures that external interrupts are serviced at appropriate points?
11.
What questions have been omitted from this list because they appear in the exam?
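
As an aid to question 9, the architectural effect of addl 12(%ebp),%edx can be sketched in Python as a load step followed by an add step. This is only a software model of what the instruction computes, not of the Pentium 4 hardware: the function names, the dictionary-based register file and memory, and the example values are all hypothetical, and the condition flags that a real add also updates are ignored. In particular, the Python local variable that carries the loaded value is a placeholder; how the processor actually passes that value between the two uops is what the question asks.

```python
MASK32 = 0xFFFFFFFF  # IA-32 general-purpose registers are 32 bits wide


def load_uop(regs, memory, base, disp):
    """Address calculation plus load: read the word at regs[base] + disp."""
    address = (regs[base] + disp) & MASK32
    return memory[address]


def add_uop(regs, dest, value):
    """Add value into the destination register, wrapping at 32 bits."""
    regs[dest] = (regs[dest] + value) & MASK32


def addl_mem_reg(regs, memory, base, disp, dest):
    """Model of `addl disp(%base),%dest` (AT&T syntax: source, destination)."""
    # `loaded` is a Python local standing in for whatever mechanism
    # carries the value from the load to the add; it models no hardware.
    loaded = load_uop(regs, memory, base, disp)
    add_uop(regs, dest, loaded)


# Hypothetical example state: memory is a dict keyed by byte address.
regs = {"ebp": 0x1000, "edx": 5}
memory = {0x100C: 37}

addl_mem_reg(regs, memory, "ebp", 12, "edx")
print(regs["edx"])  # 42
```

Note that ebp itself is unchanged by the instruction; only edx is written.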


Paul Kelly, Imperial College, November 2001

