Feature selection approaches for predictive modelling of cadmium sources and pollution levels in water springs
Authors: Abu Salem, Fatima K.; Jurdi, Mey; Alkadri, Mohamad; Hachem, Firas; Dhaini, Hassan R.
Institution: 1. Department of Computer Science, Faculty of Arts and Sciences, American University of Beirut, Beirut, Lebanon
2. Department of Environmental Health, Faculty of Health Sciences, American University of Beirut, P.O. Box 11-0236, Riad El-Solh, Beirut, 1107 2020, Lebanon
Abstract:

The World Health Organization lists cadmium (Cd) as one of the top ten chemicals of public health concern. Cd is toxic at relatively low exposure levels and has acute and chronic effects on both health and the environment. In this study, we investigate a suite of data-driven methods that could assist decision-makers in estimating Cd levels in water springs and in identifying polluting sources. Machine learning (ML) regression models based on support vector machines and a variety of tree-based models, including Random Forests, M5Tree, CatBoost, and gradient boosting, were used to identify sources of contamination and predict Cd levels. Feature selection analysis revealed that heavy traffic and distance to a major power plant in the sampled area play a leading role in Cd contamination of springs, together with precipitation levels and the average slope of the closest waste dumps upstream of the sampled springs. Our best-performing ML model was the AdaBoost regression tree using all the features (RMSE = 19.36, R^2 = 0.64). Our findings highlight the effectiveness of predictive data-driven modeling in addressing environmental challenges, particularly in high-risk areas with low resources.
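
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how RMSE and R^2 are conventionally computed from observed and predicted values. It is not the authors' code. The function names and the sample numbers are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical values for illustration only (not measurements from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

RMSE is expressed in the same units as the predicted quantity, whereas R^2 is unitless and describes the share of variance in the observed values that the model accounts for, so the two figures are usually read together.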

This article is indexed in SpringerLink and other databases.