PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

¹Northeastern University, ²Zhejiang University, ³Adobe Research, ⁴The Pennsylvania State University

PlanarRecon can detect and reconstruct 3D planes from a posed monocular video.

(The data was captured in an apartment with an iPhone XR, and the camera poses were obtained from ARKit. The model used here was trained only on ScanNet.)

Abstract

We present PlanarRecon, a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video. Unlike previous works that detect planes in 2D from a single image, PlanarRecon incrementally detects planes in 3D for each video fragment, which consists of a set of key frames, from a volumetric representation of the scene using neural networks. A learning-based tracking and fusion module then merges the planes detected in each new fragment with those from previous fragments to form a coherent global plane reconstruction. This design allows PlanarRecon to integrate observations from multiple views within each fragment and temporal information across fragments, resulting in an accurate and coherent low-polygon abstraction of the scene. Experiments show that the proposed approach achieves state-of-the-art performance on the ScanNet dataset while running in real time.

Overview Video

Reconstruction showcase

Each color denotes a different plane instance. We also show the textured mesh for ScanNet0100_00.

AR Demo

BibTeX

@inproceedings{xie2022planarrecon,
  title={{PlanarRecon}: Real-Time {3D} Plane Detection and Reconstruction from Posed Monocular Videos},
  author={Xie, Yiming and Gadelha, Matheus and Yang, Fengting and Zhou, Xiaowei and Jiang, Huaizu},
  booktitle={CVPR},
  year={2022}
}