A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization

Motivation

The fidelity in which the optimization model captures the underlying application can often be controlled. Typical examples include the discretization of Partial Differential Equations in computer vision and optimal control, the number of features in machine learning applications, the number of states in a Markov Decision Processes, and so on. In many areas it is common to take advantage of this structure by solving a low fidelity (coarse) model and then use the solution as a starting point in the high fidelity (fine) model.

Basic Idea

We adopt an optimization point of view and show how to take advantage of the availability of a hierarchy of models in a consistent manner for composite convex optimization. We do not use the coarse model just for the computation of promising starting points but also for the computation of search directions. The algorithm we propose is similar to the Iterative Shrinkage Thresholding Algorithm (ISTA) class of algorithms. The algorithm we develop combines elements from ISTA (gradient proximal steps) and the multigrid framework (coarse correction steps). We call the proposed algorithm Multilevel Iterative Shrinkage Threshold Algorithm (MISTA).

Theoretical Results

We show that the coarse correction step satisfies the contraction property as long as the objective function is convex and the differentiable part has a Lipschitz continuous gradient. If the differentiable part is strongly convex, then MISTA has a Q-linear convergence rate. On the other hand, if the differentiable part is only convex and has Lipschitz continuous gradients, then MISTA has an R-linear convergence rate.

Numerical Experiments

We compared our algorithm (MISTA) to two widely used algorithms, ISTA and FISTA on an image restoration problem.

The aim of image restoration is to take a noisy and blurred image (like the image of our Computational Optimisation Group on the left), and recover the original image. A restored image is shown on the right. We applied the algorithms on several large images (with resolution 1024x1024). The fine model has more than 1 million variables. We then used lower resolution models and MISTA to speed up the computations.

In the left figure below we compare the three algorithms in terms of function value progress. The results shown are for a single image. MISTA clearly outperforms the other algorithms and converges in essentially 5 iterations. MISTA performs more expensive iterations so to be fair we need to compare the algorithms in terms of CPU time. In the right figure below we show the CPU time required to find a solution within 2% of the optimum for the three algorithms when the images have 1% noise. Higher levels of noise lead to more ill conditioned problems. MISTA is on average ten times faster than ISTA and three/four times than FISTA.



The code and images used in the experiments above are available here.

Reference

@unpublished{mista2014,
Title = {A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization},
Author = {P. Parpas and D.V.N. Luong and D. Rueckert and B. Rustem},
Note = {Submitted},
Year = {March, 2014}}
Jointly with Duy V. N. Luong, Daniel Rueckert, Berc Rustem
Download the paper and associated code.