Comparison of different missing-imputation methods for MAIAC (multiangle implementation of atmospheric correction) AOD in estimating daily PM2.5 levels

Zhao-Yue Chen, Jie-Qi Jin, Rong Zhang, Tian-Hao Zhang, Jin-Jian Chen, Jun Yang, Chun-Quan Ou, Yuming Guo

Research output: Contribution to journal › Article › Research › peer-review

2 Citations (Scopus)


The widespread problem of missing satellite aerosol retrievals (aerosol optical depth, AOD) impairs the prediction of ground-level PM2.5 concentrations and may lead to unavoidable biases. An appropriate missing-imputation method has not been well developed to date. This study developed a two-stage approach (an AOD-imputation stage and a PM2.5-prediction stage) to predict short-term PM2.5 exposure in mainland China from 2013 to 2018. At the AOD-imputation stage, geostatistical methods and machine learning (ML) algorithms were examined to interpolate 1 km satellite aerosol retrievals. At the PM2.5-prediction stage, daily PM2.5 levels were predicted at a resolution of 1 km, based on interpolated AOD and meteorological data. The statistical performance of the different interpolation methods was comprehensively compared at each stage. The original coverage of retrieved AOD was 15.46% on average. At the AOD-imputation stage, ML methods produced a higher AOD coverage (98.64%) than geostatistical methods (21.43-87.31%). Among ML algorithms, random forest (RF) and extreme gradient boosting (XG) produced better-quality interpolated AOD (CV R2 = 0.89 and 0.85, respectively) than other algorithms (0.49-0.78), but XGBoost required only 15% of the computing time of RF. At the PM2.5-prediction stage, both RF-AOD and XG-AOD yielded higher accuracy in PM2.5 estimation (CV R2 = 0.88 for RF-AOD or XG-AOD, compared with 0.85 for the original AOD) and more stable spatial and temporal extrapolation (spatial (temporal) CV R2 = 0.83 (0.83), 0.82 (0.82), and 0.65 (0.61) for RF, XG, and the original AOD, respectively). At the AOD-imputation stage, the efficiency of gap filling depended more on external information, while its accuracy relied more on model structure. At the PM2.5-prediction stage, efficient AOD interpolation (that is, the ability to eliminate missing data) was a precondition for stable spatial and temporal extrapolation, while the quality of the interpolated AOD brought less significant improvement. XG-AOD was found to be a better choice for estimating daily PM2.5 exposure in health assessments.
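The abstract reports AOD coverage and separate spatial and temporal CV R2 values but does not describe how the folds were built. The following is a minimal sketch of one common reading, assuming that spatial CV holds out whole monitoring sites and temporal CV holds out whole days. All function names, the synthetic records, and the one-variable linear model are hypothetical placeholders for illustration; they are not the authors' RF or XGBoost models, data, or validation design.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def coverage(aod_values):
    """Share of grid cells with a retrieved AOD value (None marks a gap)."""
    return sum(v is not None for v in aod_values) / len(aod_values)


def grouped_folds(records, key, n_folds, seed=0):
    """Yield (train, test) splits that keep all records sharing `key` together.

    key="site" holds out whole monitors (assumed spatial CV);
    key="day" holds out whole days (assumed temporal CV).
    """
    groups = sorted({r[key] for r in records})
    random.Random(seed).shuffle(groups)
    fold_of = {g: i % n_folds for i, g in enumerate(groups)}
    for fold in range(n_folds):
        train = [r for r in records if fold_of[r[key]] != fold]
        test = [r for r in records if fold_of[r[key]] == fold]
        yield train, test


def fit_line(rows):
    """Least-squares fit of pm25 = a + b * aod (placeholder for RF/XGBoost)."""
    n = len(rows)
    mean_x = sum(r["aod"] for r in rows) / n
    mean_y = sum(r["pm25"] for r in rows) / n
    sxx = sum((r["aod"] - mean_x) ** 2 for r in rows)
    sxy = sum((r["aod"] - mean_x) * (r["pm25"] - mean_y) for r in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def grouped_cv_r2(records, key, n_folds=5):
    """Pool out-of-fold predictions and score them with R2."""
    observed, predicted = [], []
    for train, test in grouped_folds(records, key, n_folds):
        intercept, slope = fit_line(train)
        observed += [r["pm25"] for r in test]
        predicted += [intercept + slope * r["aod"] for r in test]
    return r_squared(observed, predicted)


# Synthetic site-day records, invented purely to exercise the functions.
rng = random.Random(42)
records = [
    {"site": s, "day": d, "aod": aod, "pm25": 10 + 60 * aod + rng.gauss(0, 5)}
    for s in range(20)
    for d in range(30)
    for aod in [rng.uniform(0.1, 1.5)]
]

spatial_r2 = grouped_cv_r2(records, "site")
temporal_r2 = grouped_cv_r2(records, "day")
print(f"coverage example: {coverage([0.3, None, 0.5, None]):.2f}")
print(f"spatial CV R2: {spatial_r2:.2f}, temporal CV R2: {temporal_r2:.2f}")
```

Grouping by site or by day, rather than splitting records at random, keeps the held-out locations or dates entirely unseen during fitting, which is what makes the score a test of extrapolation rather than interpolation.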

Original language: English
Article number: 3008
Number of pages: 16
Journal: Remote Sensing
Issue number: 18
Publication status: Published - Sep 2020


  • PM2.5
  • Aerosol optical depth
  • Machine learning
  • Missing replacement
  • Short-term
