Wysiwyg LaTeX using Checkpointing

A project to build a What-You-See-Is-What-You-Get phototypesetting system based on the LaTeX package

LaTeX is a document-preparation and typesetting package which is widely used in academia. It produces exceptionally nice-looking results (especially for maths), is extremely reliable, and can handle full-size books as well as letters and articles.

Unfortunately it is a little inconvenient to use: you prepare a "source" file in ASCII which describes the structure of your document, then "compile" it to produce an output file which you can view or print.

This means there is a delay between typing the material in and seeing the final result. For short documents the delay is annoying but brief. For a book it can be frustratingly long.

The idea of this project is to build a clever front-end which allows you to view the page you're working on practically instantly.

How can this be done easily?

The goal of this project is to reuse the LaTeX program more-or-less unchanged. This has the advantage that you inherit the reliability and quality of the existing LaTeX code.

The trick is to use checkpointing so that the LaTeX program doesn't have to rerun from the beginning each time you want to view the output.

That is:

When you run LaTeX, stop every few pages and take a checkpoint of the program's state.
When you edit the source code, make a note of the earliest byte of the input file that is changed.
Find the latest checkpoint from the previous run, just before this changed byte was read.
Rerun the program from there. You may run it to completion, or stop it as soon as enough output has been generated.
Signal the viewer that it should reread the output file to display the current page with the changes.

The result of this scheme is that you can see how a change to the input file affects the current page with a limited amount of work, no matter how large the input file.
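
The second and third steps of the procedure above amount to a byte comparison followed by a lookup in a sorted list. The following is only a sketch in Python, assuming that each checkpoint records how many bytes of input the program had consumed when it was taken. The `Checkpoint` record and the two function names are hypothetical and are not taken from any existing checkpointing package.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Checkpoint:
    page: int        # last page output before the checkpoint (assumed field)
    bytes_read: int  # input bytes consumed when it was taken (assumed field)


def first_changed_byte(old: bytes, new: bytes) -> Optional[int]:
    """Return the offset of the earliest differing byte, or None if equal."""
    limit = min(len(old), len(new))
    for i in range(limit):
        if old[i] != new[i]:
            return i
    # One file is a prefix of the other: the change starts where it ends.
    return None if len(old) == len(new) else limit


def latest_checkpoint_before(
    checkpoints: Sequence[Checkpoint], changed: int
) -> Optional[Checkpoint]:
    """Pick the latest checkpoint taken before byte `changed` was read.

    `checkpoints` must be sorted by bytes_read. A checkpoint is usable only
    if the program had consumed at most `changed` bytes, that is, offsets
    0 .. changed-1, when the checkpoint was taken.
    """
    offsets = [c.bytes_read for c in checkpoints]
    i = bisect_right(offsets, changed)
    return checkpoints[i - 1] if i > 0 else None
```

For the lookup to be safe, `bytes_read` would have to count every byte already handed to the program, including any read-ahead sitting in its input buffers. A checkpoint taken at offset 0, before any input is read, guarantees that the lookup always finds something to restart from.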

How can this be done efficiently?

The feasibility of the scheme depends on collecting checkpoints lazily. Fortunately there are well-established techniques for checkpointing computations at minimal cost using virtual memory. The basic idea is that when a checkpointing event occurs, the current data pages are protected against writing, and their identifiers are remembered in case the snapshot has to be reinstated. The client computation then continues, but when a page fault occurs due to a write to a checkpointed page, a true copy of the page is made. The client process's memory map is modified so that the copy supersedes the read-only page, and the client is allowed to continue with read-write access to the copy.

When the next checkpoint event occurs, many but not all of the read-only pages may have been copied. As before, the current pages are made read-only and their identifiers are remembered for restarting.

The effect of this scheme is that time is spent copying only those pages which are actually changed between checkpoints. Furthermore, space is occupied only by those pages which are changed.
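
The bookkeeping can be illustrated with a toy model. The sketch below is a simulation in Python, not a real implementation: there is no memory protection or page-fault handling here, the "write fault" is an ordinary `if` test, and the class and method names are invented for illustration. It only shows that a checkpoint copies page references, and that page contents are copied on the first write after a checkpoint.

```python
PAGE_SIZE = 4096


class CowMemory:
    """Toy model of lazily checkpointed memory (no real page protection)."""

    def __init__(self, num_pages: int) -> None:
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.read_only = [False] * num_pages
        self.copies_made = 0

    def checkpoint(self) -> list:
        """Protect every page and remember which page objects are current."""
        self.read_only = [True] * len(self.pages)
        return list(self.pages)  # copies references only, not page contents

    def write(self, addr: int, value: int) -> None:
        page_no, offset = divmod(addr, PAGE_SIZE)
        if self.read_only[page_no]:
            # Simulated write fault: a true copy supersedes the protected page.
            self.pages[page_no] = bytearray(self.pages[page_no])
            self.read_only[page_no] = False
            self.copies_made += 1
        self.pages[page_no][offset] = value

    def read(self, addr: int) -> int:
        page_no, offset = divmod(addr, PAGE_SIZE)
        return self.pages[page_no][offset]

    def restore(self, snapshot: list) -> None:
        """Reinstate a checkpoint; its pages stay protected so it can be reused."""
        self.pages = list(snapshot)
        self.read_only = [True] * len(self.pages)
```

In this model, writing to two distinct pages after a checkpoint increments `copies_made` by two however many pages exist, and `restore` brings back the old contents without copying anything. A real implementation would presumably get the same behaviour from the operating system's page protection and fault handling rather than from an explicit check on every write.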

A complete system

I am sure that many user-interface designs are possible, but the basic service we can offer is as follows:

A pair of windows, side-by-side: one a text editor for modifying the LaTeX source of the document, the other displaying the result.
A button (mouse, keyboard or on-screen) to cause reformatting.
An optional pop-up window showing the messages arising from any formatting errors.

A nice touch would be synchronised scrolling of the two windows --- so when you move around in the previewer, the text editor is adjusted to refer to the right area. This is easily done using the correlation between pages of the DVI file read by the previewer, pages output by TeX, and pages read by TeX.
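
As a sketch of how that correlation might be used, suppose the formatting run records, for each page it outputs, how many input bytes had been read at that moment. The table and both function names below are assumptions made for illustration. The mapping is also only approximate, since TeX generally reads some way past a page break before it outputs the page.

```python
from bisect import bisect_right


def page_for_offset(page_end_offsets, offset):
    """Map a source byte offset to the 1-based page that was being built.

    page_end_offsets[i] is the number of input bytes read when page i+1
    was output; the list must be non-empty and increasing.
    """
    page = bisect_right(page_end_offsets, offset) + 1
    # Offsets past the last recorded page are attributed to the last page.
    return min(page, len(page_end_offsets))


def offset_range_for_page(page_end_offsets, page):
    """Map a 1-based page number to the [start, end) input bytes read for it."""
    start = page_end_offsets[page - 2] if page > 1 else 0
    return start, page_end_offsets[page - 1]
```

When the previewer moves to a page, the editor could jump to the start of the range returned by `offset_range_for_page`. In the other direction, the previewer could show the page returned by `page_for_offset` for the editor's cursor position.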

Recomputation should probably be performed only when some specific command is given. Otherwise it is likely to run on half-typed input and hit a syntax error, leading to no output.

Note that when starting the system up, snapshots need be collected relatively infrequently, perhaps every five or ten pages. We can thus keep the time invested in snapshotting small. Once recomputation is requested, more frequent snapshots should be kept to reduce latency.
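
One way to express this policy is as a test applied after each page is output. The sketch below is purely illustrative: the function names are invented, the initial interval of ten pages follows the figure suggested above, and the interval of one page during recomputation is my own assumption.

```python
def should_checkpoint(page, last_checkpoint_page, recomputing,
                      initial_interval=10, recompute_interval=1):
    """Decide whether to take a checkpoint after `page` has been output."""
    interval = recompute_interval if recomputing else initial_interval
    return page - last_checkpoint_page >= interval


def plan(first_page, last_page, start_checkpoint_page, recomputing):
    """Return the pages after which a checkpoint would be taken."""
    taken = []
    last = start_checkpoint_page
    for page in range(first_page, last_page + 1):
        if should_checkpoint(page, last, recomputing):
            taken.append(page)
            last = page
    return taken
```

With these defaults, an initial run over pages 1 to 30 would checkpoint after pages 10, 20 and 30, while a recomputation restarted from the checkpoint at page 20 and run through page 25 would checkpoint after every page.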

Equipment: Unix workstation.

Recommended tools: Unix, C or C++, Tcl/Tk, GNU Emacs, xdvi (sources), latex (sources), checkpointing package (e.g. as produced by project student Paul Walmsley in 1994).