Abstract [eng] |
Satellite imagery is used in applications such as environmental monitoring, urban planning, climate change studies, infrastructure development, and disaster management. However, converting two-dimensional satellite images into realistic three-dimensional models remains challenging because of variable data quality, limited resolution, atmospheric distortions, and the computational limitations of current methods. The aim of this work is to investigate a region-based convolutional neural network (R-CNN) model for object detection and classification in satellite images. Specific problems, such as the inconsistent quality of input data and the substantial computational requirements of processing large datasets, are considered. The study involves training the enhanced R-CNN model on two datasets: a manually annotated dataset of cities in Lithuania and the “Manually Annotated High-Resolution Satellite Image Dataset of Mumbai for Semantic Segmentation.” These datasets provide a variety of urban landscapes and structural features, which is essential for developing a model that can generalise across different geographic regions and architectural styles. The model was trained for 30 epochs to detect the features of various buildings. To evaluate the performance of the model, the Precision-Recall (PR) curve and the Average Precision (AP) score were used, with an Intersection over Union (IoU) threshold of 50%. The trained model achieved an AP score of 0.974, indicating a high level of accuracy in object detection and classification tasks. The investigation demonstrates that, with enhanced algorithms and improved processing techniques, satellite imagery can be used to create highly accurate, large-scale maps more efficiently. Therefore, there is great potential for future development. 
Further investigation should focus on refining the proposed model to handle higher-resolution data, integrating additional data sources, and constructing 3D images. |
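
The evaluation summarised in the abstract (an AP score at an IoU threshold of 50%) can be illustrated with a minimal sketch. It is not the evaluation code used in the study. It assumes axis-aligned bounding boxes given as (x1, y1, x2, y2), a single image and a single class, a simplified greedy matching of detections to unmatched ground-truth boxes, and a non-interpolated AP. The function names `iou` and `average_precision` are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one image and one class (illustrative only).

    detections: list of (score, box); ground_truth: list of boxes.
    """
    matched = set()   # indices of ground-truth boxes already claimed
    tp_flags = []     # True for each detection counted as a true positive

    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    if not ground_truth:
        return 0.0

    # Sum the precision at the rank of each true positive,
    # then normalise by the number of ground-truth boxes.
    ap, tp = 0.0, 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        if is_tp:
            tp += 1
            ap += tp / rank
    return ap / len(ground_truth)


if __name__ == "__main__":
    gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
    print(average_precision(dets, gt_boxes))
```

In this sketch a detection counts as correct only when it overlaps an unmatched ground-truth box with IoU of at least 0.5, so raising `iou_threshold` makes the metric stricter. Benchmark implementations usually differ in detail, for example by interpolating the PR curve or by aggregating over many images.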