Software Reliability 2013/2014
MEng 4 and MSc Software Engineering, Department of Computing
Imperial College London
©

NEWS

19 December 2013

Software Reliability 2014 web site goes live.

Software Reliability (440)

Open to MEng 4 students and MSc in Computing (Software Engineering) students, this course provides an overview of exciting recent research into techniques and tools which aim to help developers improve the reliability of their software. The course may appeal especially to students who are interested in learning research skills.

The importance of software reliability
Society is becoming ever more reliant on software and software-controlled systems. Some of this software is safety-critical, e.g., the software used to control cars, aeroplanes and other high-speed transport. Defects in safety-critical software can lead to serious injury or death. A much larger volume of software is business-critical, e.g., software that runs in mobile phones, powers web servers and manages data centers. Defects in this type of software can lead to significant financial losses. Underpinning all of these areas is systems software: the low-level operating systems, device drivers and networking software on which complex systems are built. This foundational role means that the reliability of systems software is of primary importance.
Traditional methods for improving software reliability

Three main techniques are used in industrial and open source projects to improve software reliability:

  • Manual testing: Manually crafting a suite of tests to exercise a software system to a reasonably high degree, e.g., covering a high percentage of program statements.
  • Code reviews: Requiring that source code additions or modifications (patches) are reviewed by experienced developers before being committed to the code base.
  • Coding standards: Requiring that all developers adhere to a set of rules when writing or maintaining code. Coding standards can improve source code readability, making it easier to spot defects, and may ban the use of programming idioms that are arguably dangerous.

Rigorous manual testing, code reviews and adherence to standards are essential to the success of large software projects, but they all suffer from two common problems:

  • They depend fundamentally on human reasoning and judgement. Humans are clever, but software can be devilishly complex. It is very easy for subtle defects to creep into a project despite adherence to coding standards, and to evade manual testing and code review.
  • They do not provide guarantees. A test suite can demonstrate that certain executions of a software system do not exhibit defects, but provides no further guarantee. For safety- (and often business-) critical systems this is not enough: it is highly desirable to have a guarantee of defect freedom; ideally an absolute guarantee, but at least a guarantee that system executions have been systematically checked up to some well-defined bound.

This course
The focus of this course is on automatic techniques for improving software reliability which go beyond manual testing. The course will cover:

  • Static program verification: Attempting to prove, through the use of invariants that a piece of software is free from defects.
  • Extended static checking: A relaxed form of static program which may miss certain program defects, but which consequently is easier to automate.
  • Dynamic discovery of likely invariants: Exploiting a test suite to discover program invariants which (among other things) can be used to aid static program verification or extended static checking.
  • Dynamic symbolic execution: Exploring execution paths of a program with respect to arbitrary program inputs. This is extremely effective in discovering edge case inputs that have been missed by human, and which have a vanishingly small probability of being discovered through random testing.
  • Software model checking: Exploring the reachable states of a piece of software systematically, to discover, or show absence of, bugs. Exploration may be exhaustive, providing a guarantee of defect-freedom, or bounded, providing this guarantee only with respect to an execution bound.

A common property of all these techniques is that they are based on logic: software defects are discovered by encoding program behaviours as logical formula, which are processed by a constraint solvers. Dramatic advances in the power of constraint solvers in recent years has made it possible to apply the above techniques to larger and larger software systems.

The course will also concentrate on a particular practical application of some of these techniques:

  • Automatic defect analysis for GPU kernels: Methods for improving the reliability of parallel programs that have been accelerated to run on graphics processing units.