
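Hourly PM2.5 prediction and its comparative analysis under multi-machine learning model
Cite this article: KANG Jun-Feng, HUANG Lie-Xing, ZHANG Chun-Yan, ZENG Zhao-Liang, YAO Shen-Jun. Hourly PM2.5 prediction and its comparative analysis under multi-machine learning model[J]. China Environmental Science (中国环境科学), 2020, 40(5): 1895-1905.
Authors: KANG Jun-Feng  HUANG Lie-Xing  ZHANG Chun-Yan  ZENG Zhao-Liang  YAO Shen-Jun
Affiliations: 1. School of Architecture and Surveying Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000; 2. Chinese Antarctic Center of Surveying and Mapping, Wuhan University, Wuhan, Hubei 430079; 3. Chongqing Wanzhou District Planning and Design Institute, Chongqing 404000; 4. Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai 200241
Funding: National Key Research and Development Program (2016YFC0803105); China Scholarship Council program (201808360065); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ150661); Natural Science Foundation Youth Fund (41701462)
Abstract: To estimate PM2.5 concentration and pollution level promptly and accurately, six PM2.5 concentration prediction models were built: a K-nearest neighbor model (KNN), a BP neural network model (BPNN), a support vector regression model (SVR), a Gaussian process regression model (GPR), an XGBoost model, and a random forest model (RF). Ganzhou City, Jiangxi Province, was selected as the study area, and hourly weather-station data, PM2.5 concentration data, and MERRA-2 reanalysis data for 2017~2018 were used in the prediction experiments. The results show that, when pollutant observations are unavailable, PM2.5 concentration can still be predicted well from visibility and meteorological factors. In terms of the prediction accuracy of PM2.5 concentration, the XGBoost model ranks highest, followed by the random forest model, and the Gaussian process regression model ranks lowest. The prediction accuracy of the six models is generally highest in winter, followed by autumn and spring, and lowest in summer. The XGBoost model predicts PM2.5 pollution levels more accurately than the other models, with an overall accuracy of 87.6%, and it also has the advantages of short training time and low memory usage. The variable importance results of the XGBoost model show that visibility is the most important variable, followed by relative humidity and the time variable. This study can serve as a reference for environmental departments in accurately predicting and forecasting PM2.5 concentration.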

Keywords: PM2.5 prediction  visibility  machine learning  XGBoost  meteorological factor
Received: 2019-10-08

Hourly PM2.5 prediction and its comparative analysis under multi-machine learning model
KANG Jun-Feng,HUANG Lie-Xing,ZHANG Chun-Yan,ZENG Zhao-Liang,YAO Shen-Jun.Hourly PM2.5 prediction and its comparative analysis under multi-machine learning model[J].China Environmental Science,2020,40(5):1895-1905.
Authors:KANG Jun-Feng  HUANG Lie-Xing  ZHANG Chun-Yan  ZENG Zhao-Liang  YAO Shen-Jun
Institution:1. School of Architecture and Surveying Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China; 2. Chinese Antarctic Center of Surveying and Mapping, Wuhan University, Wuhan 430079, China; 3. Chongqing Wanzhou District Planning and Design Institute, Chongqing 404000, China; 4. Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai 200241, China
Abstract:Six models were built for timely and accurate estimation of PM2.5 concentration and pollution level: a K-Nearest Neighbor (KNN) model, a BP Neural Network (BPNN) model, a Support Vector Regression (SVR) model, a Gaussian Process Regression (GPR) model, an XGBoost model, and a Random Forest (RF) model. Ganzhou City, Jiangxi Province, was selected as the study area. Hourly ground-based meteorological data, PM2.5 concentration data, and MERRA-2 reanalysis data from 2017 to 2018 were used for modelling. The results show that, when pollutant observations are missing, PM2.5 concentration can still be predicted well from visibility and meteorological data. In terms of the prediction accuracy of PM2.5 concentration, the XGBoost model performs best, followed by the RF model, and the GPR model performs worst. The prediction accuracy of the six models is generally highest in winter, followed by autumn and spring, and lowest in summer. The XGBoost model predicts PM2.5 pollution levels more accurately than the other models, with an overall accuracy of 87.6%. Moreover, the XGBoost model has the advantages of short training time and low memory consumption. In the XGBoost model, visibility is the most important variable for PM2.5 concentration prediction, followed by relative humidity and the time variable. This study can provide a reference for environmental departments to accurately predict and forecast PM2.5 concentration.
Keywords:PM2.5 prediction  visibility  machine learning  XGBoost  meteorological factor
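
The abstract reports a pollution-level prediction accuracy (87.6% for XGBoost) alongside the concentration accuracy. The following is a minimal sketch of how such a level accuracy could be computed from observed and predicted concentrations. It is not the authors' code: the breakpoints are assumed to be the 24-hour PM2.5 limits of the Chinese AQI standard HJ 633-2012 (35, 75, 115, 150, 250 µg/m³), the abstract does not state which breakpoints the paper applied to hourly values, and the names `pollution_level` and `level_accuracy` are hypothetical.

```python
from bisect import bisect_left

# ASSUMPTION: upper limits (µg/m³) of the first five PM2.5 classes,
# taken from the 24-hour breakpoints of HJ 633-2012. The paper's own
# class limits are not given in the abstract.
BREAKPOINTS = [35, 75, 115, 150, 250]


def pollution_level(pm25):
    """Map a PM2.5 concentration to a level index 0..5 (0 = cleanest).

    Upper limits are inclusive: 35 maps to level 0, anything above 250
    maps to level 5.
    """
    if pm25 < 0:
        raise ValueError("PM2.5 concentration cannot be negative")
    return bisect_left(BREAKPOINTS, pm25)


def level_accuracy(observed, predicted):
    """Fraction of samples whose predicted level equals the observed level."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        pollution_level(o) == pollution_level(p)
        for o, p in zip(observed, predicted)
    )
    return hits / len(observed)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    obs = [20, 40, 80, 120, 160, 30]
    pred = [25, 30, 90, 100, 155, 50]
    print(level_accuracy(obs, pred))
```

Note that a level accuracy computed this way depends on where the breakpoints fall: a small concentration error near a class limit counts as a miss, while a larger error inside a wide class does not.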
This article is indexed in CNKI and other databases.