Live Dense Reconstruction with a Single Moving Camera
Richard A. Newcombe and Andrew J. Davison
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010)
Abstract
In this work we present a method which enables rapid and dense reconstruction of scenes browsed by a single live camera. We take point-based real-time structure from motion (SFM) as our starting point, generating accurate 3D camera pose estimates and a sparse point cloud. Our main novel contribution is to use an approximate but smooth base mesh generated from the SFM to predict the view at a bundle of poses around automatically selected reference frames spanning the scene, and then warp the base mesh into highly accurate depth maps based on view-predictive optical flow and a constrained scene flow update. The quality of the resulting depth maps means that a convincing global scene model can be obtained simply by placing them side by side and removing overlapping regions. We show that a cluttered indoor environment can be reconstructed from a live hand-held camera in a few seconds, with all processing performed by current desktop hardware. Real-time monocular dense reconstruction opens up many application areas, and we demonstrate both real-time novel view synthesis and advanced augmented reality where augmentations interact physically with the 3D scene and are correctly clipped by occlusions.
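To make the idea of placing depth maps side by side in a global model more concrete, here is a minimal sketch of lifting one reference-frame depth map into global 3D points. It is not the paper's implementation: it assumes a standard pinhole camera model, and the function name `backproject` and the parameters `fx`, `fy`, `cx`, `cy`, `R` and `t` (intrinsics and a camera-to-world pose) are hypothetical names chosen for illustration. Overlap removal and meshing are not shown.

```python
def backproject(depth, fx, fy, cx, cy, R, t):
    """Lift a dense depth map to 3D points in the global frame.

    Assumptions (illustrative, not taken from the paper):
      depth  - list of rows; each entry is depth along the optical axis,
               or None / non-positive where no estimate exists
      fx, fy, cx, cy - pinhole intrinsics in pixels
      R, t   - camera-to-world rotation (3x3 nested list) and translation
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None or z <= 0:
                continue  # skip pixels without a valid depth
            # Pinhole back-projection of pixel (u, v) into camera coordinates.
            pc = ((u - cx) / fx * z, (v - cy) / fy * z, z)
            # Rigid transform from the camera frame into the global frame.
            pw = tuple(
                sum(R[i][j] * pc[j] for j in range(3)) + t[i]
                for i in range(3)
            )
            points.append(pw)
    return points


# Usage: a 2x2 depth map seen by a camera sitting 1 unit along the global z axis.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cloud = backproject([[2.0, None], [0.0, 4.0]], 1.0, 1.0, 0.0, 0.0,
                    identity, [0.0, 0.0, 1.0])
```

Running this once per reference frame, each with its own SFM pose, would place every local depth map in a shared coordinate system, which is the precondition for the side-by-side stitching the abstract describes.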
[PDF]
An older video of the reconstruction process and of physics-augmented VR using the reconstruction. Note this video is currently being updated!
[AVI]
Examples
Top: Overview of the desk scene. Bottom: reconstruction from 8 camera bundles of 4 images each.
Left: Texture-mapped model of the 3-patch reconstruction. Right: Shaded mesh.
Bottom Left: Rendering the resulting vertex buffer for the shoe scene. Right: Shaded mesh.
Top: Screenshot from the live running PTAM thread showing the sparse SFM point cloud projected into the current frame. The result of a 4-image local reconstruction using the constrained scene flow update: error map (left) and shaded mesh using surface normals (right).
Top: Overview image. Bottom Left: 4 patch local reconstructions stitched into the global frame. Right: the resulting normal map.