Live Dense Reconstruction with a Single Moving Camera
Richard A. Newcombe and Andrew J. Davison
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010)
In this work we present a method which enables rapid and dense reconstruction of scenes browsed by a single live camera. We take point-based real-time structure from motion (SFM) as
our starting point, generating accurate 3D camera pose estimates and a sparse point cloud. Our main novel contribution is to use an approximate but smooth base mesh
generated from the SFM to predict the view at a bundle of
poses around automatically selected reference frames spanning the scene, and then warp the base mesh into highly
accurate depth maps based on view-predictive optical flow
and a constrained scene flow update. The quality of the resulting depth maps means that a convincing global scene
model can be obtained simply by placing them side by side
and removing overlapping regions. We show that a cluttered indoor environment can be reconstructed from a live
hand-held camera in a few seconds, with all processing performed by current desktop hardware. Real-time monocular dense reconstruction opens up many application areas,
and we demonstrate both real-time novel view synthesis and
advanced augmented reality where augmentations interact
physically with the 3D scene and are correctly clipped by occlusions.
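The constrained scene flow update described in the abstract can be illustrated with a small sketch: each base-mesh vertex is only allowed to move along its back-projected ray, so the view-predictive 2D optical flow measured in every image of the bundle contributes one linear constraint on a single scalar displacement, which is solved in least squares. The pinhole model, its Jacobian, and the solver below are illustrative assumptions for exposition, not the authors' implementation:

```python
import numpy as np

# Hypothetical pinhole intrinsics; illustrative values, not from the paper.
fx = fy = 500.0
cx, cy = 320.0, 240.0

def project(X):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    return np.array([fx * X[0] / X[2] + cx, fy * X[1] / X[2] + cy])

def projection_jacobian(X):
    """2x3 Jacobian of the pinhole projection at camera-frame point X."""
    x, y, z = X
    return np.array([[fx / z, 0.0,    -fx * x / z**2],
                     [0.0,    fy / z, -fy * y / z**2]])

def constrained_scene_flow_update(X, ray, cams, flows):
    """
    Solve for the scalar displacement lam of vertex X along its
    back-projected unit ray, given measured 2D optical flow in each
    bundle image.  cams: list of (R, t) mapping world points into each
    camera frame; flows: per-image 2D flow (predicted -> observed).
    """
    num = 0.0
    den = 0.0
    for (R, t), u in zip(cams, flows):
        Xc = R @ X + t               # vertex in this camera's frame
        J = projection_jacobian(Xc)  # linearised projection
        a = J @ (R @ ray)            # image motion per unit step along ray
        num += a @ u                 # stack constraints a * lam = u ...
        den += a @ a                 # ... and solve in least squares
    lam = num / den
    return X + lam * ray
```

Note that the reference camera itself contributes almost nothing (motion along its own ray projects to near-zero image motion); the constraint comes from the other views in the bundle, which is why a bundle of poses around each reference frame is needed.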
An older video of the reconstruction process and physics-based augmented reality using the reconstruction.
Top: Overview of the desk scene. Bottom: Reconstruction from 8 camera bundles of 4 images each.
Left: Texture-mapped model of the three-patch reconstruction. Right: Shaded mesh.
Bottom Left: Rendering the resulting vertex buffer for the shoe scene. Right: Shaded mesh.
Top: Screenshot from the live PTAM thread, showing the sparse SFM point cloud projected into the current frame. Bottom: The result of a four-image local reconstruction using the constrained scene flow update: error map (left); shaded mesh rendered using surface normals (right).
Top: Overview image. Bottom Left: Four patch local reconstructions stitched into the global frame. Right: The resulting normal map.
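The global-model assembly described in the abstract (placing the per-bundle depth maps side by side and removing overlapping regions) can be sketched as follows: back-project each pixel of a new depth map and invalidate it wherever it already lands on a valid pixel of an existing map. The pinhole intrinsics and the overlap test below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

# Illustrative pinhole intrinsics (assumed values, not from the paper).
FX = FY = 500.0
CX, CY = 320.0, 240.0

def backproject(depth):
    """Back-project a depth map (H x W) to 3D points in its camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) / FX * depth
    y = (v - CY) / FY * depth
    return np.stack([x, y, depth], axis=-1)           # H x W x 3

def remove_overlap(new_depth, new_pose, ref_depth, ref_pose):
    """
    Zero out pixels of new_depth whose 3D points already project onto
    valid (non-zero) pixels of ref_depth.  Poses are 4x4 camera-to-world.
    """
    pts = backproject(new_depth).reshape(-1, 3)
    # Transform points from the new camera into the reference camera frame.
    T = np.linalg.inv(ref_pose) @ new_pose
    pts_ref = pts @ T[:3, :3].T + T[:3, 3]
    z = pts_ref[:, 2]
    u = np.round(FX * pts_ref[:, 0] / np.maximum(z, 1e-9) + CX).astype(int)
    v = np.round(FY * pts_ref[:, 1] / np.maximum(z, 1e-9) + CY).astype(int)
    h, w = ref_depth.shape
    inside = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    covered = np.zeros(len(pts), dtype=bool)
    covered[inside] = ref_depth[v[inside], u[inside]] > 0
    out = new_depth.reshape(-1).copy()
    out[covered] = 0.0                                # drop duplicated surface
    return out.reshape(new_depth.shape)
```

A fuller system would also check depth agreement before discarding a pixel (so that genuinely new surface seen along the same ray is kept); this sketch only shows the side-by-side placement with simple overlap removal.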