Abstract—Air quality prediction is an active research topic in meteorology. It remains challenging because of the uncertainty of atmospheric pollutant emission sources and the multi-dimensional, multi-scale, and non-stationary characteristics of meteorological environment data. For example, traditional statistical forecasting methods usually have to fit a nonlinear relationship between meteorological features and pollutants, which makes their models extremely difficult to learn.
To address these challenges, we propose EWA-GBDT, a novel air quality prediction model combining Exponentially Weighted Averages and Gradient Boosting Decision Tree. More specifically, we first collect two real-world datasets: 1) the daily concentration data of six pollutants for the period from 01/01/2014 to 31/12/2016, and 2) the daily meteorological feature data for the cities of Shijiazhuang and Xingtai for the period from 01/01/2014 to 28/02/2017. We then extract 13 types of meteorological features using the Support Vector Machine Recursive Feature Elimination (SVM-RFE) method. From the perspective of pollutant concentration, these are the most highly correlated features. Next, we apply the Exponentially Weighted Averages principle to these features and the pollutant concentrations to obtain meteorological expectation values. Finally, considering the excellent overall prediction performance of Ensemble Learning (EL), we use its Gradient Boosting Decision Tree (GBDT) algorithm to predict the concentration values of the pollutants and thus output the air quality level. We conducted experiments on the two datasets, and the results demonstrate that EWA-GBDT outperforms the other baseline methods in terms of RMSE, MAE, and R².
Index Terms—Air quality prediction, exponentially weighted averages, gradient boosting decision tree, recursive feature elimination.
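The exponentially weighted averaging step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or smoothing factor used in EWA-GBDT, so the following assumes the standard recursive form v_t = beta * v_(t-1) + (1 - beta) * x_t with bias correction. The function name, the `beta` parameter, and the sample readings are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def exponentially_weighted_average(values: Sequence[float], beta: float = 0.9) -> List[float]:
    """Return the bias-corrected exponentially weighted average of `values`.

    Assumed recurrence (not taken from the paper):
        v_t = beta * v_{t-1} + (1 - beta) * x_t,  with v_0 = 0
    Each output is v_t / (1 - beta**t), which removes the bias toward
    zero that the v_0 = 0 initialisation would otherwise introduce.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    smoothed: List[float] = []
    v = 0.0
    for t, x in enumerate(values, start=1):
        v = beta * v + (1.0 - beta) * x
        smoothed.append(v / (1.0 - beta ** t))
    return smoothed


# Hypothetical daily pollutant readings (illustrative, not from the paper's datasets).
readings = [80.0, 120.0, 100.0, 60.0]
ewa = exponentially_weighted_average(readings, beta=0.5)
```

In a pipeline like the one described, smoothed series of this kind would serve as input features for a gradient boosting regressor. A larger `beta` weights the history more heavily, and a smaller one tracks recent days more closely.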
The authors are with the School of Information Science and Engineering, Yanshan University, Qinhuangdao 066044, China (e-mail: gongjibing@163.com, wangdanysu8100@163.com, chenda_ysu@163.com, wangshuli@ysu.edu.cn).
Cite: Jibing Gong, Dan Wang, Da Chen, and Shuli Wang, "EWA-GBDT: A Novel Air Quality Prediction Model Combining Exponentially Weighted Averages and Gradient Boosting Decision Tree," International Journal of Modeling and Optimization, vol. 9, no. 4, pp. 177-184, 2019.