Title Voxel-based 3D object generation from single images using an enhanced deep learning architecture
Authors Pocius, Algirdas ; Blažauskas, Tomas ; Butkevičiūtė, Eglė
DOI 10.15388/DAMSS.15.2024
eISBN 9786090711125
Is Part of DAMSS: 15th Conference on Data Analysis Methods for Software Systems, Druskininkai, Lithuania, November 28-30, 2024. Vilnius: Vilniaus universiteto leidykla, 2024, p. 78-79. eISBN 9786090711125
Abstract [eng] Deep learning has revolutionised the field of 3D modelling by providing powerful tools for generating three-dimensional objects from various input sources, such as images, point clouds, and even textual descriptions. The ability to reconstruct accurate 3D models from limited information is crucial for numerous applications, including computer-aided design (CAD), virtual reality (VR), augmented reality (AR), and game development. However, generating precise 3D objects from single-view images remains a significant challenge due to issues such as geometric complexity, occlusion, and computational cost. The aim of this study is to enhance existing computer vision and graphics methodologies by improving and optimising deep learning algorithms that enable the generation of accurate 3D objects, thereby improving the efficiency of computer-aided design. The study utilised the ShapeNetCore (v2) dataset, which includes over 51,000 unique 3D models across 55 different categories. A deep-learning architecture developed for this work was used to generate 3D objects in the form of voxels from single-view images. Data processing and model training were conducted using the PyTorch framework, which offers flexibility and efficiency in building and training deep neural networks. To address challenges such as geometric complexity and occlusion, efficient data preprocessing techniques, including data augmentation and normalisation, were incorporated to enhance the quality and diversity of the training data. The model's performance was evaluated using Chamfer Distance and Intersection-over-Union (IoU): the Chamfer Distance quantifies the similarity between the predicted and ground-truth point clouds, while the IoU measures the overlap between the predicted and actual voxel grids. Preliminary experimental results demonstrate that the proposed model effectively generates accurate 3D objects from single images, achieving an overall IoU score of 0.6549.
These initial findings suggest that the model performs well across various object categories. This work contributes to the field of 3D object generation by presenting an optimised deep-learning solution that enhances the accuracy of reconstructed objects. The model's adaptability to various object categories and its potential applications in computer-aided design, virtual reality, and game development highlight its significance in advancing 3D modelling technologies.
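The two evaluation metrics named in the abstract have standard definitions that can be sketched compactly. The following is a minimal NumPy illustration, not the authors' code: the function names are hypothetical, and the paper's actual PyTorch implementation may differ (e.g. batched tensors, GPU kernels, squared vs. non-squared Chamfer terms).

```python
import numpy as np

def voxel_iou(pred, target):
    """Intersection-over-Union between two binary voxel grids of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:  # both grids empty: define IoU as perfect overlap
        return 1.0
    return np.logical_and(pred, target).sum() / union

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3).

    For each point in one set, take the squared distance to its nearest
    neighbour in the other set, then average both directions.
    """
    # Pairwise squared distances, shape (N, M).
    d = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Under these definitions, the reported overall IoU of 0.6549 means that, on average, roughly two thirds of the union of predicted and ground-truth occupied voxels are shared between the two grids.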
Published Vilnius : Vilniaus universiteto leidykla, 2024
Type Conference paper
Language English
Publication date 2024