Imperial's Software Performance Engineering Laboratory
the software performance optimization research group
Dr Tony Field, Department of Computing
Professor Wayne Luk, Department of Computing (Computer Systems Section, Custom Computing Group)
Professor Chris Hankin, Department of Computing
Professor Chris Pain, Department of Earth Science & Engineering
Dr Gerard Gorman, Department of Earth Science & Engineering
Lee Howes (PhD student)
Michael Mellor (PhD student)
Francis Russell (PhD student)
We also have a continuous stream of student members of the research group. A partial list includes:
MEng/BEng projects 2006-7: Peter Collingbourne, James Huggett, Alex LaMaison, Vishal Mistry.
MSc 2006-7: Steve Lilly, Tadhg Pearson, David Olivier
MEng/BEng projects 2005-6: Numaan Chaudry, Henry Falconer, Andrew Huff, Ekaterina Itskova, John Pappin
UROPs 2005-6: Hubert Plociniczak, David Ingram
Christian Lengauer
Scott Baden (Associate Professor)
Alumni and PhD Graduates: see here
The Software Performance Optimization research group at Imperial College
Our research objective is to make it easier for software engineers to achieve adequate performance. Software performance really matters:
In computational science and engineering, users need to assemble simulation, analysis, visualisation, data repository and optimization components quickly and without compromising responsiveness. They need their software to combine sophisticated data structures and numerical methods with architecture-specific program optimisations. They need tools to scale up functional prototypes to handle full-size problems, and they need programs to adapt dynamically to the processing, networking and other resources available at run-time.
In embedded systems and mobile appliances, engineers need to derive the optimum hardware-software configuration to achieve the application's real-time demands while minimising power requirements and retaining flexibility. They need to generate code for multiple system-on-a-chip variants, consisting, for example, of processor core, signal processor and graphics device, from a common code base. They need to derive a custom system-on-a-chip configuration from the program structure. Standard libraries need to be automatically tuned for each custom CPU configuration. Dynamically-loaded code needs to exploit the resources of the specific device on which it runs.
In business information systems, software architects need to design secure, distributed systems by building and assembling re-usable components. Team leaders need to be able to rely on software services with minimal consideration for how they are implemented and distributed. Tools are needed to diagnose and rectify performance problems by automatically aggregating communications, caching data, proxying access methods and combining updates. Such tools involve optimisations which are not just inter-procedural, but which span a distributed system and cross security domains.
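As a concrete illustration of "aggregating communications" and "combining updates", here is a minimal sketch of a batching proxy that combines many fine-grained remote updates into a single call. All names (RemoteStore, BatchingProxy) are hypothetical, not part of any real framework; a production version would also handle ordering and failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical remote interface: one bulk method costing one round trip.
interface RemoteStore {
    void putAll(List<String> keys, List<Integer> values);
}

// A proxy that preserves a simple per-item interface for callers,
// but combines their updates into one remote call.
class BatchingProxy {
    private final RemoteStore store;
    private final List<String> keys = new ArrayList<>();
    private final List<Integer> values = new ArrayList<>();
    int roundTrips = 0; // exposed so the saving is observable

    BatchingProxy(RemoteStore store) { this.store = store; }

    // Callers still write one update at a time...
    void put(String key, int value) {
        keys.add(key);
        values.add(value);
    }

    // ...but the pending updates travel in a single round trip.
    void flush() {
        if (!keys.isEmpty()) {
            store.putAll(new ArrayList<>(keys), new ArrayList<>(values));
            roundTrips++;
            keys.clear();
            values.clear();
        }
    }
}
```

An inter-component optimiser would aim to insert this kind of proxy automatically, after proving the reordering of updates is safe.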
Our approach: run-time optimization of dynamically-composed software
The challenge we have identified is to extend existing compiler optimisations to the component level. Inter-component optimization goes beyond classical inter-procedural optimization in two crucial respects.
Component metadata is similar to the inter-procedural summary information modern compilers associate with separately-analysed procedures. This covers the analysis side of the picture; component metadata also describes the performance characteristics of the available component implementations, together with the parameters which can be adjusted to match each implementation to the context in which it is being used.
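One possible shape for such metadata, sketched in Java: an interface covering both the analysis side (effect summaries) and the performance side (a cost estimate plus tunable parameters). All names and the cost model here are illustrative assumptions, not part of any existing framework.

```java
import java.util.Map;

// Hypothetical sketch of metadata a component might export so an
// inter-component optimiser can choose and tune an implementation.
interface ComponentMetadata {
    // Analysis side: an effect summary, analogous to inter-procedural
    // summary information for a separately-analysed procedure.
    boolean mayWrite(String parameterName);

    // Performance side: an estimated cost for a given problem size,
    // under the current setting of the tunable parameters.
    double estimatedCost(long problemSize);

    // Parameters (e.g. block size) the optimiser can adjust to match
    // the implementation to the context in which it is used.
    Map<String, Integer> tunableParameters();
    void setParameter(String name, int value);
}

// A trivial instance for an imaginary blocked kernel.
class BlockedKernelMetadata implements ComponentMetadata {
    private int blockSize = 64;

    public boolean mayWrite(String p) { return p.equals("result"); }

    public double estimatedCost(long n) {
        // Toy cubic cost model; a real one would be calibrated.
        return (double) n * n * n / blockSize;
    }

    public Map<String, Integer> tunableParameters() {
        return Map.of("blockSize", blockSize);
    }

    public void setParameter(String name, int value) {
        if (name.equals("blockSize")) blockSize = value;
    }
}
```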
Our current projects:
1. Develop a software infrastructure based on Java and remote method invocation (RMI) which makes it possible to apply compiler code motion optimizations to distributed applications
2. Extend current understanding of code motion optimizations and the underlying static analyses, in order to provide a principled basis for optimising distributed applications without compromising security or availability
3. Develop software tools for building distributed scientific applications which adapt, at run-time, to the problem, the available resources, and to the available precursors cached from earlier computation
4. Build a theoretical framework for run-time optimization of irregular scientific applications, which balances the potential advantage of having full knowledge of how future execution will develop, against the time and space needed to compute and maintain this information
5. Develop our understanding of the search space for optimally matching the execution strategy for a computational kernel to the available hardware, and extend it to larger-scale applications
Beyond the projects listed above, there are some pervasive technical problems which are not likely to be solved soon:
Dependence analysis for rich data structures
Smart programs use rich data structures, both for flexibility and to support sophisticated algorithms. Unfortunately, rich data structures are hard to analyse: pointers lead to the possibility of two objects having shared substructure, and this blocks many of our optimisation techniques. Sergio Almeida, a PhD graduate from the group, has an interesting proposal; see his ECOOP97 paper (available from his home page http://shiva.di.uminho.pt/~psa/).
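A tiny example of why possible sharing blocks optimisation. If the two references below may alias, the compiler cannot keep a loaded value in a register across the intervening store; it must assume the write may have changed it.

```java
// Minimal illustration: aliasing defeats redundancy elimination.
class Node {
    int value;
    Node next;
}

class AliasDemo {
    // If 'a' and 'b' may point to the same node, the compiler cannot
    // reuse the first load of a.value after the write through 'b';
    // it must reload, because the result depends on whether they alias.
    static int sum(Node a, Node b) {
        int x = a.value;    // first load of a.value
        b.value = 42;       // may or may not overwrite a.value
        return x + a.value; // second load cannot be eliminated
    }
}
```

With `sum(n, n)` the write is visible through `a`; with two distinct nodes it is not, so the two call sites genuinely compute different things from identical-looking code.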
Reasoning about performance issues with rich data structures
It is usually very difficult to reason about performance issues with rich, linked data structures. How large would the serialised version of this parameter be? How much work is involved in this subtree? This is analogous to a static analysis problem, but to solve it we need to add code to maintain the information we want as the computation proceeds. We need the compiler to generate code to provide the information we will need to make good run-time optimization decisions.
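A minimal sketch of the kind of generated bookkeeping this implies: a tree that maintains its own subtree size incrementally, so a question like "how much work is in this subtree?" costs O(1) at the moment a run-time optimisation decision is made, instead of a full traversal. The class and method names are illustrative.

```java
// Sketch: bookkeeping code (which a compiler or library could insert)
// that maintains the information a run-time optimiser will need.
class SizedTree {
    int value;
    SizedTree left, right;
    int size = 1; // nodes in this subtree, kept up to date on mutation

    SizedTree(int value) { this.value = value; }

    // Replacing a child adjusts this node's size by the difference
    // between the new and old subtree sizes.
    void setLeft(SizedTree child) {
        size += (child == null ? 0 : child.size)
              - (left == null ? 0 : left.size);
        left = child;
    }

    // O(1) answer used when deciding, say, whether a subtree is big
    // enough to be worth serialising or shipping to a remote worker.
    int workEstimate() { return size; }
}
```

(A full version would update ancestors too, e.g. via parent pointers; the point is only that the information is maintained as the computation proceeds rather than computed on demand.)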
Programming languages and libraries
We need to get people to use much better programming languages. It's unlikely that the problems with rich data structures are going to be solved by better analysis; the best bet is to get programmers to tell us what we need to know. New language concepts may be necessary. Perhaps it's possible to do it with metadata for container class libraries?
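One conceivable form such container metadata could take in Java: annotations through which the library author declares properties the analysis cannot infer. The annotation names below are invented for illustration; nothing consumes them here beyond reflection.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical declarations a container author might make.
@Retention(RetentionPolicy.RUNTIME)
@interface NoInternalSharing {}   // elements never share substructure

@Retention(RetentionPolicy.RUNTIME)
@interface PureObserver {}        // method does not mutate the container

@NoInternalSharing
class IntBag {
    private int[] data = new int[0];

    void add(int x) {
        int[] bigger = new int[data.length + 1];
        System.arraycopy(data, 0, bigger, 0, data.length);
        bigger[data.length] = x;
        data = bigger;
    }

    @PureObserver
    int size() { return data.length; }
}
```

An optimiser could then treat two IntBag instances as disjoint, and hoist calls to `size()` out of loops, without any pointer analysis of the library internals.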
Linking static analysis to dynamic program properties
Conventional optimizing compilers separate analysis from synthesis: find out what you can about the program, and use it to generate the best possible code. In the run-time context, we need to turn this round: analyse the code to find out what you need to know about the run-time context to achieve optimisations, then generate code (or construct component metadata) to acquire this information and use it.
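A toy sketch of the inverted scheme: code a compiler might emit at a call site to acquire the run-time fact it discovered it needs (here, the typical input size), then use that fact to select an implementation. The class, method names and threshold are all assumptions for illustration.

```java
// Sketch: the call site monitors the property the optimiser needs,
// then uses it to choose between implementations.
class AdaptiveSort {
    private long observations = 0;
    private long sumOfSizes = 0;
    static final int THRESHOLD = 32; // assumed cut-over point

    String chooseImplementation(int inputSize) {
        // Generated monitoring code: track typical size at this site.
        observations++;
        sumOfSizes += inputSize;
        long averageSize = sumOfSizes / observations;

        // Use the acquired information to pick a variant.
        return averageSize < THRESHOLD ? "insertionSort" : "mergeSort";
    }
}
```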
Program optimization as mathematical optimisation: understanding the search space
Although a run-time optimizer has to run fast initially, a long-running program can be subjected to very intensive optimization. We can therefore consider searching for the best possible code, backtracking when a transformation proves counter-productive (e.g. when loop unrolling leads to register spills). A more systematic approach is to formulate the optimisation problem in mathematical terms, and use ideas from operational research to characterise and partition the search space.
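The simplest instance of optimisation-as-search is empirical tuning: time each candidate variant of a kernel and keep the fastest. The sketch below searches over unroll factors for a reduction; it is illustrative only, and a real tuner would control for JIT warm-up and measurement noise.

```java
// Sketch: empirical search over one tuning parameter (unroll factor).
class UnrollSearch {
    static long sink; // keeps results live so timing is not optimised away

    // The same reduction, parameterised by unroll factor.
    static long sum(int[] a, int unroll) {
        long s = 0;
        int i = 0;
        for (; i + unroll <= a.length; i += unroll)
            for (int j = 0; j < unroll; j++) s += a[i + j];
        for (; i < a.length; i++) s += a[i]; // leftover elements
        return s;
    }

    // Time each candidate and return the fastest.
    static int bestUnroll(int[] a, int[] candidates) {
        int best = candidates[0];
        long bestTime = Long.MAX_VALUE;
        for (int u : candidates) {
            long t0 = System.nanoTime();
            for (int rep = 0; rep < 100; rep++) sink += sum(a, u);
            long elapsed = System.nanoTime() - t0;
            if (elapsed < bestTime) { bestTime = elapsed; best = u; }
        }
        return best;
    }
}
```

The "more systematic approach" in the text replaces this exhaustive timing loop with a model of the search space, so that most candidates never need to be run at all.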