Formal Analysis Techniques for GPU Kernels
Tutorial at HiPEAC 2014, Berlin
21 January 2014


December 2013

FAT-GPU web site goes live.

July 2013

FAT-GPU is accepted as a half-day tutorial at HiPEAC 2014.


FAT-GPU 2014 is supported by the FP7 project CARP: Correct and Efficient Accelerator Programming

The tutorial is also supported by the HiPEAC Network of Excellence, through which the HiPEAC conference is organised.

FAT-GPU 2014


John Wickerson

Postdoctoral researcher at Imperial College London.


Formal Analysis Techniques for GPU Kernels (FAT-GPU) is a half-day tutorial on recent developments in the area of improving the reliability of GPU kernels. The focus will be the GPUVerify tool, developed at Imperial College London as part of the CARP EU project. The tutorial will cover how GPUVerify can be used for bug detection, proving absence of defects, and checking portability of GPU kernel code written in OpenCL and CUDA.

The tutorial will be practical, aimed at researchers and practitioners in the parallel programming community who have a reasonable background in GPU programming. The tutorial will not assume a background in formal verification or theoretical computer science.

This YouTube video on GPUVerify gives a flavour of the style of the tutorial (though the presenter at the tutorial is different, and will give his own take on the topic):


Programming models such as OpenCL, CUDA and C++ AMP allow massively parallel graphics processing units (GPUs) to be used for the acceleration of computations in diverse application areas, including financial analysis, medical imaging, media processing and physical simulation. In each of these models, the programmer writes a kernel which describes the computation to be performed by each GPU thread.

Writing correct GPU kernels is challenging: to achieve high performance, parallel threads must communicate using shared memory and synchronize using barriers. Errors in memory access and synchronization code can lead to unintentional data races and barrier divergence.

A related challenge is writing portable kernels which will work correctly across a range of devices. This is especially important for OpenCL, which is gaining ground as the programming model of choice for all kinds of accelerators, including a variety of manycore processors as well as FPGAs (e.g., Altera).

Kernel defects and portability problems can lead to nondeterministic behaviour which is difficult to reproduce and fix. Worse, it may be that some errors do not manifest on the GPU hardware with which a developer tests their code, and therefore go unnoticed until the code ships and is executed on a wide variety of alternative platforms.

There is a clear need for better techniques and tools to help GPU kernel developers write code which is free from concurrency errors, and there has been recent interest in the use of methods based on formal verification and symbolic execution for this purpose. This tutorial will provide an overview and demonstration of the state-of-the-art in formal analysis techniques for GPU kernels, including novel methods that are being developed as part of the FP7 project CARP: Correct and Efficient Accelerator Programming.

Presenter Bio

John Wickerson is a postdoctoral research associate working on how to reason rigorously about GPU programs. He is specifically interested in formalising the memory model defined by the OpenCL standard. In previous research, he has developed a diagrammatic proof system for separation logic, and has investigated the application of the rely/guarantee method to the verification of sequential modules.