3D Reconstruction based on multi-view stereo in the deep learning era: a survey and comparison of methods.
This paper provides an in-depth review of deep learning methods for multi-view stereo (MVS). Through a detailed classification of 3D reconstruction methods, we highlight the significance of multi-view 3D reconstruction. We first examine the evolution of traditional approaches, including structure from motion, MVS, and surface reconstruction, and then identify the inherent limitations of the MVS stage. Subsequent sections focus on learning-based MVS methods designed to address these limitations. Our focus is primarily on mainstream methods that use depth maps for 3D scene representation and build upon the traditional plane-sweeping MVS algorithm. Specifically, this review first systematically categorizes current methods by their overall architecture, and then discusses their advances across the classical pipeline stages: feature extraction, cost aggregation, cost volume regularization, depth inference, and depth fusion. Comprehensive evaluations on public datasets show that recent learning-based methods achieve significant performance gains. For instance, leading approaches such as MVSFormer++ reach an overall error as low as 0.281 mm on the DTU dataset, an improvement of over 39% on the baseline MVSNet. On the Tanks and Temples benchmark, top methods achieve a mean F-score exceeding 67, demonstrating strong generalization in complex scenes. Key observations underscore the effectiveness of Transformers, implicit neural representations, and geometry-aware depth prediction, along with the promising potential of progressive refinement architectures and unsupervised approaches. Finally, we discuss existing challenges and propose promising future directions. These findings contribute to ongoing discussions on advancing monocular camera capabilities via deep learning, particularly as more mobile applications are deployed on smartphones and drones. [ABSTRACT FROM AUTHOR]
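As a reading aid for the DTU figures quoted above, the following minimal Python sketch shows the arithmetic behind a relative error reduction, assuming the "over 39%" figure means a reduction in overall error relative to the baseline. The function names `relative_reduction` and `implied_baseline` are hypothetical helpers introduced here for illustration; only the values 0.281 mm and 39% come from the abstract, and the baseline figure the sketch prints is a bound derived from those two numbers, not a value reported in this record.

```python
def relative_reduction(baseline_mm: float, method_mm: float) -> float:
    """Fractional error reduction of a method relative to a baseline.

    Lower error is better, so a positive result means the method improves
    on the baseline.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_mm - method_mm) / baseline_mm


def implied_baseline(method_mm: float, reduction: float) -> float:
    """Baseline error at which `method_mm` is exactly a `reduction` improvement."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must lie in [0, 1)")
    return method_mm / (1.0 - reduction)


# Values quoted in the abstract.
METHOD_OVERALL_MM = 0.281  # overall error on DTU, in mm
MIN_REDUCTION = 0.39       # "over 39%"

# Assumption: the percentage is measured relative to the baseline's overall
# error. Under that reading, the baseline error must exceed this bound.
baseline_floor_mm = implied_baseline(METHOD_OVERALL_MM, MIN_REDUCTION)
print(f"implied baseline overall error: more than {baseline_floor_mm:.3f} mm")
```

Under this reading, any baseline overall error above the printed bound is consistent with the abstract's claim; `relative_reduction` can be applied to a baseline value taken from the paper's own tables to check the exact percentage.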