Extreme gradient boosting model to estimate PM2.5 concentrations with missing-filled satellite data in China

Zhao Yue Chen, Tian Hao Zhang, Rong Zhang, Zhong Min Zhu, Jun Yang, Ping Yan Chen, Chun Quan Ou, Yuming Guo

Research output: Contribution to journal › Article › Research › peer-review

92 Citations (Scopus)


Several studies have attempted to predict ground PM2.5 concentrations using satellite aerosol optical depth (AOD) retrievals. However, over 70%–90% of aerosol retrievals are missing, and not at random, which limits and biases the estimation. To the best of our knowledge, this issue has not been well resolved to date. The aim of this study was to develop an interpolation technique to handle the missing retrievals and to estimate daily PM2.5 for a high-coverage dataset with 3-km resolution in China by fitting the complex temporal and spatial variations. We developed a two-step interpolation method (a mixed-effect model and inverse distance weighting) to fill the missing AOD values. Next, an extreme gradient boosting (XGBoost) technique that includes a non-linear exposure-lag-response model (NELRM) was proposed and validated to estimate daily PM2.5 levels across China during 2014–2015. After the two interpolation steps, the missing-value rate of daily AOD data fell from 87.91% to 13.83%. The cross-validation (CV) R², root mean square error (RMSE) and mean absolute percentage prediction error (MAPE) of the interpolation were 0.76, 0.10 and 21.41%, respectively. Cross-validation of the daily PM2.5 predictions gave R² = 0.86, RMSE = 14.98 and MAPE = 23.72%. These results indicate that the two-step interpolation method can largely resolve the non-random missing data problem and that the combined XGBoost methods estimate fine particulate matter concentrations well.
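To make the inverse distance weighting step and the reported validation metrics concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `idw_fill` and `cv_metrics`, the power of 2, the choice of the 4 nearest observed cells, and the use of grid indices as distances are all hypothetical, and the R² shown is the coefficient-of-determination form, which may differ from the definition used in the study. The mixed-effect step and the XGBoost model are not shown.

```python
import numpy as np

def idw_fill(grid, power=2.0, k=4):
    """Fill NaN cells of a 2-D array by inverse distance weighting.

    Assumption: distances are measured in grid-index units and only the
    k nearest observed cells contribute (both choices are illustrative).
    """
    filled = grid.copy()
    observed = ~np.isnan(grid)
    obs_rc = np.argwhere(observed)   # (row, col) of observed cells, row-major
    obs_val = grid[observed]         # values in the same row-major order
    if obs_val.size == 0:
        return filled                # nothing to interpolate from
    for r, c in np.argwhere(~observed):
        d = np.hypot(obs_rc[:, 0] - r, obs_rc[:, 1] - c)
        nearest = np.argsort(d)[:k]
        w = 1.0 / d[nearest] ** power  # d > 0 because the target cell is missing
        filled[r, c] = np.sum(w * obs_val[nearest]) / np.sum(w)
    return filled

def cv_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAPE in percent) for held-out predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    rmse = np.sqrt(np.mean(resid ** 2))
    mape = 100.0 * np.mean(np.abs(resid) / np.abs(y_true))
    return r2, rmse, mape

# Toy AOD tile with two missing cells (made-up values, not study data).
aod = np.array([[1.0, np.nan],
                [np.nan, 3.0]])
aod_filled = idw_fill(aod)
```

In this toy tile each missing cell is equidistant from the two observed cells, so both are filled with their mean. In practice the metrics would be computed on folds held out from fitting, as in the cross-validation the abstract describes.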

Original language: English
Pages (from-to): 180-189
Number of pages: 10
Journal: Atmospheric Environment
Publication status: Published - 1 Apr 2019


  • Aerosol optical depth
  • China
  • Extreme gradient boosting
  • Missing replacement
