Simulating a PC running Linux
Idea:
Write a program that runs under a standard PC operating system and
simulates the execution of a complete, bare PC and its
peripherals. Your simulation should be good enough to run a
Unix-like operating system such as Linux (or perhaps another
OS such as Windows, Solaris or Plan 9).
Motivation:
There are several reasons why it would be useful to be able to run a complete
operating system as a process within a standard (presumably Unix)
environment:
- One reason is, of course, so that one can experiment
with non-standard operating systems without having to re-configure
one's machine.
- Another reason (the main reason from the point of view of my
research) is that one can instrument the
simulation and study how the simulated system spends its time.
Design problems/issues:
- You need to be able to simulate the effect of privileged
instructions, but the simulator will be running in user mode.
A simple way to solve this is to use an instruction set
simulator, instead of directly executing the code of the
client operating system or its processes.
A related problem is that the client OS will manipulate
its address translation hardware, which is not easily done
in the simulator.
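To make the interpretation idea concrete, here is a minimal fetch-decode-execute loop for an invented two-register toy ISA (not x86 — the opcodes, register file and the "privileged" HALT instruction are all made up for this sketch). The point is that every instruction, privileged or not, passes through the simulator, so a privileged operation executed in user mode can simply be turned into a trap for the client OS rather than needing host privilege:

```c
#include <stdint.h>

enum { OP_LOADI, OP_ADD, OP_HALT };   /* HALT stands in for a privileged op */

typedef struct {
    uint32_t reg[2];
    int      pc;
    int      kernel_mode;             /* the client OS kernel runs with this set */
    int      trapped;                 /* set when user-mode code hits HALT */
    int      halted;
} Cpu;

/* One interpreted step: fetch the instruction at pc, decode, execute. */
void step(Cpu *c, uint32_t prog[][3])
{
    uint32_t *insn = prog[c->pc++];
    switch (insn[0]) {
    case OP_LOADI: c->reg[insn[1]] = insn[2]; break;
    case OP_ADD:   c->reg[insn[1]] += c->reg[insn[2]]; break;
    case OP_HALT:
        if (c->kernel_mode) c->halted = 1;
        else                c->trapped = 1;  /* deliver a trap to the client OS */
        break;
    }
}
```

A real x86 simulator has vastly more decode work, but the shape of the loop is the same.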
- Using an instruction set simulator would mean that your system
would be portable to non-Intel platforms, but would be slow.
You may be able to improve the speed by reverting to direct
execution in certain circumstances (e.g. when executing the
client OS's subprocesses), although this is potentially
tricky (e.g. a trap will enter the host's handler, not the
client OS's).
Another complicated solution is to translate the Intel binary
into the host machine's instruction set (which may be Intel again,
but without privileged instructions).
- Another advantage of using an instruction set simulator is that
you can monitor how many instructions have been executed, how much
time is spent computing, etc. You can also simulate the cache
and monitor its effectiveness, and you can simulate the address
translation mechanism (TLB etc).
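As a sketch of the kind of cache model such instrumentation needs, here is a direct-mapped cache that just counts hits and misses (the sizes and the byte-address interface are arbitrary choices for this sketch; the simulator would call cache_access() on every simulated memory reference):

```c
#include <stdint.h>

#define LINES      256
#define LINE_BYTES 32

typedef struct {
    uint32_t tag[LINES];
    int      valid[LINES];
    unsigned long hits, misses;
} Cache;

/* Look up a byte address; returns 1 on a hit, 0 on a miss,
   updating the statistics counters either way. */
int cache_access(Cache *c, uint32_t addr)
{
    uint32_t line = (addr / LINE_BYTES) % LINES;
    uint32_t tag  = addr / (LINE_BYTES * LINES);
    if (c->valid[line] && c->tag[line] == tag) {
        c->hits++;
        return 1;
    }
    c->valid[line] = 1;     /* fill the line on a miss */
    c->tag[line]   = tag;
    c->misses++;
    return 0;
}
```

A TLB model is structurally similar: an array of (virtual page, physical page) entries looked up on every translation, with a miss handled by walking the simulated page tables.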
- You should expect the simulated OS to run more slowly. Since
this makes every boot expensive, it's worth investigating checkpoints
to avoid having to simulate the bootstrap sequence repeatedly.
The idea of checkpointing is to take a complete copy of the
state of the simulator, so that it can be restarted from where
it left off.
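Checkpointing is straightforward precisely because the whole machine state lives in ordinary simulator data structures: a checkpoint is just a dump of those structures to a file. A sketch, assuming an invented Machine layout holding the registers and simulated RAM:

```c
#include <stdio.h>
#include <stdint.h>

#define MEM_BYTES (1 << 20)   /* 1 MB of simulated RAM, for the sketch */

typedef struct {
    uint32_t reg[8];
    uint32_t pc;
    uint8_t  mem[MEM_BYTES];
} Machine;

/* Write the entire machine state to a file; returns 0 on success. */
int save_checkpoint(const Machine *m, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t n = fwrite(m, sizeof *m, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Restore machine state from a checkpoint file; returns 0 on success. */
int load_checkpoint(Machine *m, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(m, sizeof *m, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}
```

In a real simulator the device models' state (pending interrupts, disk position, etc.) must be included too, or the restored system will misbehave.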
- You need to build simulations of peripherals as well as
the CPU. You can, of course, choose the simplest peripherals
you can find - perhaps just a serial terminal line and a disk
drive to start with. Graphics and network devices can come
later. Note that the disk device can get the actual data
from the host filesystem.
It is desirable to simulate real devices, so that you can
install any client OS which can drive them.
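The disk-backed-by-host-file idea can be sketched in a few lines: a client read or write of sector n becomes a seek plus read/write on a backing file. The 512-byte sector size matches a real PC disk; the function names are invented for this sketch:

```c
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

#define SECTOR 512

/* Read one simulated disk sector from the host backing file.
   Returns 0 on success, -1 on error. */
int disk_read(int backing_fd, uint32_t sector, uint8_t buf[SECTOR])
{
    off_t off = (off_t)sector * SECTOR;
    if (lseek(backing_fd, off, SEEK_SET) != off) return -1;
    return read(backing_fd, buf, SECTOR) == SECTOR ? 0 : -1;
}

/* Write one simulated disk sector to the host backing file. */
int disk_write(int backing_fd, uint32_t sector, const uint8_t buf[SECTOR])
{
    off_t off = (off_t)sector * SECTOR;
    if (lseek(backing_fd, off, SEEK_SET) != off) return -1;
    return write(backing_fd, buf, SECTOR) == SECTOR ? 0 : -1;
}
```

The hard part is not this data path but faking the device's register-level interface convincingly enough for an unmodified driver.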
- For some applications, it is important to estimate the time
the client system would take if it were running on real
hardware of some configuration. There are many interesting
research projects which would be made possible if this were
available.
Examples include
- Data caches; most research into the design of caches for
high-performance microprocessors has assumed a single,
CPU-intensive application (cf. the SPEC benchmark suite).
This doesn't predict the performance of the system when
running a mixture of jobs, or an application which involves
interaction with the OS and with other processes.
- Instruction caches; operating systems have unusual instruction
cache behaviour compared with typical CPU-intensive
applications, and the instruction cache design options need to
be re-evaluated (e.g. interrupts, and especially system
calls, can lead to many instructions being executed with
very little re-use of recently-executed instructions,
trashing the background application's cache to no purpose).
- Network interfaces; traditionally, networks have been slow
compared with CPUs and memory systems. Now,
commonly-available LAN technology pushes the limits of the
CPU's responsiveness and its memory system's bandwidth.
It's vital to eliminate software overheads from network
interfacing, but it's unclear how to integrate this into
existing protocol stacks.
- Non-volatile RAM, parallel disk systems, etc;
There are many ways of improving the performance of file
access while retaining data stability, but it's unclear
which approach is suitable in which situation.
- Power management is a neglected but crucial issue in many
interesting application areas. It is primarily an operating
system function, but its impact on the system's perceived
responsiveness must be controlled.
The job:
- Decide how Intel instructions are to be executed in the client OS
kernel. Unless you can think of a really clever scheme, this
is going to have to be done in a software simulator. Implement
one and test it carefully.
- Figure out the bootstrap file format and fix your simulator
to work with it. Test by compiling a simple standalone
application and running it on the real hardware and on the
simulator (this requires some trivial form of output mechanism).
- Develop a simulation of the serial line controller, and test it
against the real hardware. When this works you should be able to
load a Linux kernel and get an error message when it fails to
contact any other peripherals. You should be able to test it
fully by modifying the Linux kernel.
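A sketch of the transmit side of such a controller, mimicking the two registers a polling driver touches on a 16550-style PC UART: a line-status register that says "transmitter ready" and a data register that sends a byte. The register offsets 0 and 5 and the 0x20 status bit match the real 16550; everything else is simplified for the sketch:

```c
#include <stdio.h>
#include <stdint.h>

#define UART_DATA 0       /* transmit/receive data register */
#define UART_LSR  5       /* line status register */
#define LSR_THRE  0x20    /* "transmit holding register empty" bit */

typedef struct {
    FILE *out;            /* host-side end of the simulated serial line */
} Uart;

/* Client OS reads a UART register (an "in" instruction hits this). */
uint8_t uart_in(Uart *u, int reg)
{
    (void)u;
    if (reg == UART_LSR) return LSR_THRE;  /* always ready to transmit */
    return 0;
}

/* Client OS writes a UART register: a byte written to the data
   register goes straight out to the host side. */
void uart_out(Uart *u, int reg, uint8_t val)
{
    if (reg == UART_DATA) fputc(val, u->out);
}
```

The simulator's I/O-instruction dispatch would route port accesses at the UART's base address to uart_in/uart_out; receive, interrupts and baud-rate divisor registers are the obvious next steps.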
- Identify the other peripherals that need to work, such as timers
etc, and develop simulations for them. Test similarly. The most
complex will probably be the disk drive (you may wish to start
with a floppy drive, since you ought to be able to map the basic
operations to raw I/O on the host OS and access a standard
boot disk).
- At this point, you ought to be able to run a Linux kernel,
although it is likely that problems will be uncovered
with the simulation of interrupt, trap and address translation
mechanisms. Once you fix them, the system should actually boot,
albeit slowly.
- Once the system works, you will want to spend some time sorting
out more devices
(e.g. a hard disk importing files from the host), perhaps
a network, checkpointing, and some monitoring.
- As soon as the system is stable, it would be really nice to use
it to collect some statistics as a start towards using it
as a platform for experimental research.
- Also, you ought to be able to boot some other OS's. Windows,
Solaris, BSD, etc - if you get the basics right they should
all just run.
Reading:
Check out the SimOS
project; they describe the results from a similar project for
SGI hardware.
Equipment:
PC running Linux; stacks of disk space.
Recommended tools:
Unix, C/C++.
Suitability:
This is a demanding research-level project with enormous potential
scope. The basic prerequisites are 1) insight into performance, architecture and
applications issues, 2) the practical ability to get complicated software to do what you
want, and 3) the imagination and clarity of thought to design good experiments.