Sajid, Moiz (2021) Multiview 3D Shape Reconstruction using Deep Learning. DLR-Interner Bericht. DLR-IB-RM-OP-2021-226. Master's thesis. Technische Universität München. 56 pp.
Abstract
Deep learning has transformed computer vision. Progress began with 2D images and has recently extended to 3D computer vision, where tasks such as inferring 3D shape from multiple images have gained popularity. Three factors made these advances possible: large 3D object datasets such as ShapeNet, Pix3D, and ModelNet; network architectures that handle 3D data better, such as DeepSDF, ShapeHD, and PSG; and accessible, efficient computing resources for processing 3D data.
Humans can infer the 3D world around them from a single view of a scene. For computers, estimating 3D information from a single view is challenging because single-view reconstruction is generally ill-posed and ambiguous. Providing images from multiple viewpoints lets a computer better reconstruct the 3D geometry of the object shown.
The goal of this thesis is to present and evaluate a multiview 3D shape reconstruction method for better reconstructing 3D environments. More specifically, the proposed method takes a sparse set of input images and produces a 3D representation of the object. Such reconstructions are crucial in applications such as virtual/augmented reality, autonomous driving, and robotic manipulation and grasping. To this end, the thesis first proposes a large-scale multiview dataset based on ShapeNet, with 1,050,816 rendered images and 43,784 3D Truncated Signed Distance Function (TSDF) volumes, including accurate camera poses and intrinsic parameters. Second, it presents a novel 2D-3D end-to-end trainable deep learning method for 3D shape reconstruction from images taken from multiple viewpoints together with their camera parameters. The method maps 2D features directly into 3D using a backprojection layer. Finally, detailed evaluation studies are conducted with the proposed approach on the newly introduced dataset.
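
The abstract does not say how the backprojection layer is implemented. As a rough illustration of the general idea only, the following sketch lifts per-view 2D feature maps onto a voxel grid. It assumes a pinhole camera model, world-to-camera poses given as 3x4 matrices, nearest-neighbor feature sampling, and plain averaging across views. All of these choices, as well as the function names `voxel_centers`, `backproject`, and `fuse_views`, are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def voxel_centers(resolution, extent=1.0):
    """Centers of a resolution^3 grid spanning [-extent/2, extent/2]^3, shape (N, 3)."""
    step = extent / resolution
    axis = (np.arange(resolution) + 0.5) * step - extent / 2.0
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


def backproject(features, K, world_to_cam, centers):
    """Lift one view's feature map (C, H, W) onto 3D points (N, 3).

    K is the 3x3 intrinsic matrix, world_to_cam a 3x4 [R | t] matrix.
    Returns (C, N) sampled features and an (N,) boolean visibility mask.
    """
    C, H, W = features.shape
    # World coordinates -> camera coordinates.
    cam = centers @ world_to_cam[:, :3].T + world_to_cam[:, 3]
    depth = cam[:, 2]
    in_front = depth > 1e-6
    # Perspective projection; avoid dividing by non-positive depth.
    safe_depth = np.where(in_front, depth, 1.0)
    pix = cam @ K.T
    u = np.rint(pix[:, 0] / safe_depth).astype(int)  # column index
    v = np.rint(pix[:, 1] / safe_depth).astype(int)  # row index
    valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Nearest-neighbor sampling (an assumption; bilinear is also common).
    out = np.zeros((C, centers.shape[0]), dtype=features.dtype)
    out[:, valid] = features[:, v[valid], u[valid]]
    return out, valid


def fuse_views(feature_maps, Ks, poses, resolution=8, extent=1.0):
    """Average backprojected features over all views that see each voxel."""
    centers = voxel_centers(resolution, extent)
    total = 0.0
    count = np.zeros(centers.shape[0])
    for f, K, T in zip(feature_maps, Ks, poses):
        lifted, valid = backproject(f, K, T, centers)
        total = total + lifted
        count += valid
    fused = total / np.maximum(count, 1.0)
    return fused.reshape(-1, resolution, resolution, resolution)
```

In a trainable network, the resulting `(C, D, D, D)` feature volume would typically be passed to 3D convolutional layers that predict the output volume, and the sampling step would use a differentiable interpolation so that gradients reach the 2D encoder.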
elib URL of the entry: | https://elib.dlr.de/146800/
---|---
Document type: | Report series (DLR-Interner Bericht, Master's thesis)
Title: | Multiview 3D Shape Reconstruction using Deep Learning
Authors: | Sajid, Moiz
Date: | November 2021
Refereed publication: | No
Open Access: | Yes
Number of pages: | 56
Status: | Published
Keywords: | Deep Learning, Multiview Reconstruction, Machine Learning
Institution: | Technische Universität München
Department: | Fakultät für Informatik
HGF - Research field: | Aeronautics, Space and Transport
HGF - Program: | Space
HGF - Program theme: | Robotics
DLR - Research area: | Space
DLR - Program: | R RO - Robotics
DLR - Research theme (project): | R - Multisensory World Modelling (RM) [RO]
Location: | Oberpfaffenhofen
Institutes and institutions: | Institute of Robotics and Mechatronics (since 2013) > Perception and Cognition
Deposited by: | Denninger, Maximilian
Deposited on: | 07 Dec 2021 09:07
Last modified: | 20 Dec 2021 10:16