An overview of the proposed joint learning pipeline. Given a set of images, we first apply a Structure-from-Motion (SfM) algorithm to construct an initial scene graph (left), in which each node represents a posed image. An edge between two nodes indicates that the two images are expected to share overlapping regions. Next, the initial scene graph is sanitized, and each node is assigned a confidence score based on the number of matching points among neighboring nodes. We then train a Neural Radiance Field (NeRF) using the confidence-aware scene graph and the images. The training process alternates between fitting the radiance field and updating the scene graph. Finally, we extract the 3D scene mesh from the trained field.
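As a rough illustration of the confidence-assignment step, the sketch below scores each node by the total number of matched points on its incident edges, normalized by the largest total in the graph. The function name `node_confidence`, the edge dictionary layout, and the max-normalization are assumptions made for this example; the caption only states that the score depends on the number of matching points among neighboring nodes.

```python
from collections import defaultdict


def node_confidence(match_counts):
    """Assign each scene-graph node a confidence in [0, 1].

    match_counts maps an undirected edge (i, j) to the number of matched
    points between images i and j. This layout and the normalization by the
    maximum per-node total are illustrative assumptions, not the paper's
    exact formula.
    """
    totals = defaultdict(int)
    for (i, j), count in match_counts.items():
        # Each edge contributes its match count to both endpoints.
        totals[i] += count
        totals[j] += count

    peak = max(totals.values(), default=0)
    if peak == 0:
        # No matches anywhere: every node gets zero confidence.
        return {node: 0.0 for node in totals}
    return {node: total / peak for node, total in totals.items()}


# Hypothetical four-image graph; node 3 is weakly connected.
edges = {(0, 1): 120, (1, 2): 80, (0, 2): 100, (2, 3): 5}
conf = node_confidence(edges)
```

Under this scheme a weakly connected node, such as node 3 in the toy graph, receives a score close to zero, which is the kind of signal that could be used to down-weight a likely outlier pose during training.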
3D surface reconstruction from images is essential for numerous applications. Recently, Neural Radiance Fields (NeRFs) have emerged as a promising framework for 3D modeling. However, NeRFs require accurate camera poses as input, and existing methods struggle to handle significantly noisy pose estimates (i.e., outliers), which are commonly encountered in real-world scenarios. To tackle this challenge, we present a novel approach that optimizes radiance fields with scene graphs to mitigate the influence of outlier poses. Our method incorporates an adaptive inlier-outlier confidence estimation scheme based on scene graphs, emphasizing images that are highly compatible with their neighborhood and consistent in rendering quality. We also introduce an effective intersection-over-union (IoU) loss to optimize the camera pose and surface geometry, together with a coarse-to-fine strategy to facilitate training. Furthermore, we propose a new dataset containing typical outlier poses for a detailed evaluation. Experimental results on various datasets consistently show that our method outperforms existing approaches, remaining robust to outliers and producing high-quality 3D reconstructions.
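The abstract does not specify which quantities the IoU loss compares, so the following is only a generic sketch of a differentiable-style ("soft") IoU loss between two sequences of values in [0, 1], for example two soft occupancy masks. The function name `soft_iou_loss`, the flat-list inputs, and the `eps` stabilizer are assumptions for illustration and are not taken from the paper.

```python
def soft_iou_loss(pred, target, eps=1e-8):
    """Return 1 - soft IoU for two equal-length sequences of values in [0, 1].

    Intersection is the sum of elementwise products; union is the sum of
    p + t - p * t. This is a common generic formulation, assumed here for
    illustration rather than taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    # eps avoids division by zero when both inputs are all zeros.
    return 1.0 - intersection / (union + eps)


# Hypothetical masks: identical, disjoint, and partially overlapping.
loss_same = soft_iou_loss([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0])
loss_disjoint = soft_iou_loss([1.0, 0.0], [0.0, 1.0])
loss_partial = soft_iou_loss([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0])
```

The loss approaches 0 when the two masks coincide and 1 when they do not overlap, so minimizing it pulls the two regions into agreement.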
Qualitative comparisons on the proposed dataset. As shown, our method is more robust to outlier poses, producing less distortion and better geometric detail. For the sake of space, we display the five top-performing results for each scene.
@misc{chen2024sgnerfneuralsurfacereconstruction,
title={SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization},
author={Yiyang Chen and Siyan Dong and Xulong Wang and Lulu Cai and Youyi Zheng and Yanchao Yang},
year={2024},
eprint={2407.12667},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.12667},
}