Department of Computing

Applications of Computing in Industry: Lecture

28 February
Noon, LT308 Huxley
 
Company: Oracle

Title: Debugging large-scale operating systems
Abstract:

Operating systems are a medium through which hardware resources are managed and exported as services to applications. When they stop working, so does the business that the tower of software and hardware supports. Finding the root cause of a system failure in an operating system built from a code base of millions of lines, possibly with third-party kernel modules, is a seriously challenging task. When the systems in question have thousands of CPUs and hundreds of terabytes of physical memory, and are running hundreds of thousands of processes, the complexity steps up a gear. We examine some of the practical issues involved in post-mortem failure analysis and what the future challenges are in scaling up diagnosis support to deal with the large systems of tomorrow.

Speaker Details:

Jim Moore


 

Jim Moore is the EMEA Solaris Revenue Product Engineering Director, responsible for Solaris engineering staff in the UK and Prague. The role attempts to reconcile the holy trinity of product quality, customer satisfaction and making R.P.E. an enjoyable place to work. Previously, Jim was a senior kernel engineer for many years, working across the spectrum of Solaris subsystems.

