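Traffic accident prediction based on a combined ARIMA and XGBoost model
Cite this article: Xie Xuebin, Kong Lingyan. Traffic accident prediction based on a combined ARIMA and XGBoost model[J]. Journal of Safety and Environment, 2021, 21(1): 277-284.
Authors: Xie Xuebin  Kong Lingyan
Affiliation: School of Resources and Safety Engineering, Central South University, Changsha 410083
Abstract: In recent years, traffic accidents and the losses they cause have seriously affected socioeconomic development and the improvement of living standards, and traffic accident prediction can provide data support for accident prevention. Based on the autoregressive integrated moving average (ARIMA) model and the extreme gradient boosting (XGBoost) model, a combined time series prediction model is built to forecast trends in traffic accident indicators. According to the characteristics of traffic accidents, four indicators are selected: number of accidents, number of injuries, number of deaths, and loss. First, the ARIMA model parameters are determined from autocorrelation and partial autocorrelation plots, and the final model is chosen by its AIC (Akaike information criterion) value. Then, a residual series is constructed from the residuals of the ARIMA predictions for each of the four indicators and modeled with XGBoost to obtain corrected residual predictions. Finally, the combined model's final prediction is obtained from the residual prediction and the ARIMA prediction. The case results show that, for all four indicators, the combined model predicts more accurately than the single ARIMA model and the Holt-Winters model. The improvement is most pronounced for number of injuries and number of deaths: the mean absolute percentage error falls by 5.4317 percentage points for number of injuries and by 3.6259 percentage points for number of deaths.

Keywords: safety engineering  traffic accident prediction  time series  ARIMA  XGBoost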
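
The residual-correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which features the XGBoost model was trained on or how the two predictions are merged, so the lagged-residual features, the additive combination, and every function name and number below are assumptions made for illustration. A simple mean of the last lags stands in for a trained XGBoost regressor so that the sketch runs with the standard library alone.

```python
from typing import List, Sequence, Tuple


def residual_series(actual: Sequence[float], fitted: Sequence[float]) -> List[float]:
    """Residual at each time step: observed value minus the ARIMA fitted value."""
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    return [a - f for a, f in zip(actual, fitted)]


def lagged_samples(residuals: Sequence[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Frame the residual series as supervised samples.

    Each feature row holds the previous n_lags residuals and the target is the
    residual that follows them. (Assumed feature design, not from the paper.)
    """
    features, targets = [], []
    for t in range(n_lags, len(residuals)):
        features.append(list(residuals[t - n_lags:t]))
        targets.append(residuals[t])
    return features, targets


def combine(arima_forecast: float, predicted_residual: float) -> float:
    """Assumed additive combination: ARIMA forecast plus predicted residual."""
    return arima_forecast + predicted_residual


def mean_of_lags(lags: Sequence[float]) -> float:
    """Stand-in residual predictor; a trained XGBoost regressor would go here."""
    return sum(lags) / len(lags)


# Made-up numbers, purely illustrative (not data from the paper).
actual = [100.0, 104.0, 99.0, 107.0, 103.0]
arima_fitted = [98.0, 105.0, 101.0, 104.0, 104.0]

residuals = residual_series(actual, arima_fitted)
X, y = lagged_samples(residuals, n_lags=2)   # training pairs for the residual model

next_residual = mean_of_lags(residuals[-2:])  # predicted residual for the next step
final_forecast = combine(106.0, next_residual)  # 106.0 is a hypothetical ARIMA forecast
print(residuals, X, y, final_forecast)
```

In a fuller version, `X` and `y` would be passed to a gradient-boosting regressor in place of `mean_of_lags`, and the ARIMA fitted values and forecast would come from a fitted ARIMA model rather than hard-coded lists.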

On the ways to the traffic accident prediction based on the ARIMA and XGBoost combined model
XIE Xue-bin,KONG Ling-yan.On the ways to the traffic accident prediction based on the ARIMA and XGBoost combined model[J].Journal of Safety and Environment,2021,21(1):277-284.
Authors:XIE Xue-bin  KONG Ling-yan
Institution: School of Resources and Safety Engineering, Central South University, Changsha 410083, China
Abstract: Traffic accident prediction can provide data support for traffic accident prevention. This paper builds a combined time series model from an ARIMA model and an XGBoost model to forecast trends in four traffic accident indicators, chosen according to the characteristics of traffic accidents: the number of accidents, the number of injuries, the number of deaths, and the loss. First, the ADF test is applied to check the stationarity of each original series. Autocorrelation and partial autocorrelation plots of the differenced series are then used to identify the ARIMA model parameters, and the optimal model is selected by its AIC value. The model residuals are also tested to check whether they follow a normal distribution. Next, each of the four indicator series is forecast with its selected ARIMA model, a residual series is formed from the predicted results and the true values, and this residual series is modeled with XGBoost to obtain predicted residuals. Finally, the combined model's prediction is derived from the predicted residual and the ARIMA forecast. The case study shows that, for all four indicators, the combined model predicts more accurately than either the single ARIMA model or the Holt-Winters model. The improvement is most marked for the number of injuries and the number of deaths, whose mean absolute percentage errors fall by 5.4317 and 3.6259 percentage points respectively. The mean absolute percentage error falls by 2.5609 percentage points for the number of accidents and by 0.6272 percentage points for the loss. The combined model thus offers a new method for traffic accident prediction.
Keywords:safety engineering  traffic accident prediction  time series  ARIMA  XGBoost
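
The accuracy comparison in the abstract is stated as a reduction in mean absolute percentage error (MAPE), measured in percentage points. The sketch below shows how such a reduction is computed, using the standard MAPE definition. The series are made-up numbers chosen for easy arithmetic, not results from the paper, and the variable names are hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so such input raises
    ZeroDivisionError here rather than being silently skipped.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only (not from the paper).
actual = [200.0, 250.0, 400.0]
single_model = [220.0, 225.0, 440.0]    # each forecast is off by 10%
combined_model = [210.0, 237.5, 420.0]  # each forecast is off by 5%

mape_single = mape(actual, single_model)
mape_combined = mape(actual, combined_model)

# A difference of two percentages is expressed in percentage points.
reduction_pp = mape_single - mape_combined
print(f"MAPE reduced by {reduction_pp:.4f} percentage points")
```

Read this way, a figure such as "5.4317 percentage points" is the single-model MAPE minus the combined-model MAPE, not a relative change in the error.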
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.