TY - JOUR
T1 - Multi-modal deep learning approaches to semantic segmentation of mining footprints with multispectral satellite imagery
AU - Saputra, Muhamad Risqi U.
AU - Bhaswara, Irfan Dwiki
AU - Nasution, Bahrul Ilmi
AU - Ern, Michelle Ang Li
AU - Husna, Nur Laily Romadhotul
AU - Witra, Tahjudil
AU - Feliren, Vicky
AU - Owen, John R.
AU - Kemp, Deanna
AU - Lechner, Alex M.
N1 - Publisher Copyright: © 2024
PY - 2025/3/1
Y1 - 2025/3/1
N2 - Existing remote sensing applications in mining are often of limited scope, typically mapping multiple mining land covers for a single mine or only mapping mining extents or a single feature (e.g., tailings dam) for multiple mines across a region. Many of these works have a narrow focus on specific mine land covers rather than encompassing the variety of mining and non-mining land use in a mine site. This study presents a pioneering effort in performing deep learning-based semantic segmentation of 37 mining locations worldwide, representing a range of commodities from gold to coal, using multispectral satellite imagery, to automate mapping of mining and non-mining land covers. Due to the absence of a dedicated training dataset, we crafted a customized multispectral dataset for training and testing deep learning models, leveraging and refining existing datasets in terms of boundaries, shapes, and class labels. We trained and tested multimodal semantic segmentation models, particularly based on U-Net, DeepLabV3+, Feature Pyramid Network (FPN), SegFormer, and IBM-NASA foundational geospatial model (Prithvi) architecture, with a focus on evaluating different model configurations, input band combinations, and the effectiveness of transfer learning. In terms of multimodality, we utilized various image bands, including Red, Green, Blue, and Near Infra-Red (NIR) and Normalized Difference Vegetation Index (NDVI), to determine which combination of inputs yields the most accurate segmentation. Results indicated that among different configurations, FPN with DenseNet-121 backbone, pre-trained on ImageNet, and trained using both RGB and NIR bands, performs the best. We concluded the study with a comprehensive assessment of the model's performance based on climate classification categories and diverse mining commodities. We believe that this work lays a robust foundation for further analysis of the complex relationship between mining projects, communities, and the environment.
KW - Deep learning
KW - Global mining footprints
KW - Multispectral
KW - Semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85213255366&partnerID=8YFLogxK
U2 - 10.1016/j.rse.2024.114584
DO - 10.1016/j.rse.2024.114584
M3 - Article
AN - SCOPUS:85213255366
SN - 1879-0704
VL - 318
JO - Remote Sensing of Environment
JF - Remote Sensing of Environment
M1 - 114584
ER -