Comparing discriminant analysis, neural networks and logistic regression for predicting species distributions: a case study with a Himalayan river bird
Authors: Stéphanie Manel, Jean-Marie Dias, Steve J. Ormerod
Abstract: We assessed the occurrence of a common river bird, the Plumbeous Redstart Rhyacornis fuliginosus, along 180 independent streams in the Indian and Nepali Himalaya. We then compared the performance of multiple discriminant analysis (MDA), logistic regression (LR) and artificial neural networks (ANN) in predicting this species’ presence or absence from 32 variables describing stream altitude, slope, habitat structure, chemistry and invertebrate abundance. Using the entire data set (= training set) and a threshold for accepting presence in ANN and LR set to P≥0.5, ANN correctly classified marginally more cases (88%) than either LR (83%) or MDA (84%). Model performance was then assessed using two methods of data partitioning. In a ‘leave-one-out’ approach, LR correctly predicted more cases (82%) than MDA (73%) or ANN (69%). However, in a holdout procedure, all the methods performed similarly (73–75%). All methods predicted true absence (i.e. specificity in holdout: 81–85%) better than true presence (i.e. sensitivity: 57–60%). These effects reflect species’ prevalence (= frequency of occurrence), but are seldom considered in distribution modelling. Despite occurring at only 36% of the sites, Plumbeous Redstarts are one of the most common Himalayan river birds, and problems will be greater with less common species. Both LR and ANN require an arbitrary threshold probability (often P=0.5) at which to accept species presence from model prediction. Simulations involving varied prevalence revealed that LR was particularly sensitive to threshold effects. ROC (receiver operating characteristic) plots were therefore used to compare model performance on test data at a range of thresholds; LR always outperformed ANN. This case study supports the need to test species’ distribution models with independent data, and to use a range of criteria in assessing model performance. ANN do not yet have major advantages over conventional multivariate methods for assessing bird distributions.
LR and MDA were both more efficient in the use of computer time than ANN, and also more straightforward in providing testable hypotheses about environmental effects on occurrence. However, LR was apparently subject to chance significant effects from explanatory variables, emphasising the well-known risks of models based purely on correlative data.
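The threshold-dependent measures named in the abstract (overall correct classification, sensitivity, specificity) and the threshold-independent ROC comparison can be illustrated with a minimal sketch. This is not the authors' code: the function names `confusion_rates`, `roc_points` and `auc_rank` are hypothetical, and the ten observations and predicted probabilities are invented purely for illustration (four presences out of ten sites, roughly echoing a low-prevalence species).

```python
from typing import List, Sequence, Tuple


def confusion_rates(y_true: Sequence[int], p_pred: Sequence[float],
                    threshold: float) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) when presence is
    accepted for predicted probabilities >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, p_pred):
        predicted_present = p >= threshold
        if y == 1:
            if predicted_present:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_present:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


def roc_points(y_true: Sequence[int], p_pred: Sequence[float],
               thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """Return one (false positive rate, true positive rate) pair per threshold."""
    points = []
    for t in thresholds:
        sens, spec, _ = confusion_rates(y_true, p_pred, t)
        points.append((1.0 - spec, sens))
    return points


def auc_rank(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen presence site scores higher than a randomly chosen absence site
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented illustration data: 1 = species present, 0 = absent.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [0.9, 0.6, 0.4, 0.3, 0.7, 0.2, 0.1, 0.35, 0.05, 0.45]

for t in (0.3, 0.5, 0.7):
    sens, spec, acc = confusion_rates(observed, predicted, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} "
          f"specificity={spec:.2f} accuracy={acc:.2f}")
print("AUC =", auc_rank(observed, predicted))
```

In this toy example, moving the threshold trades sensitivity against specificity, whereas the rank-based area under the ROC curve summarises performance across all thresholds, which is why a ROC comparison does not depend on an arbitrary cut-off such as P=0.5.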
Keywords: Neural networks; Logistic regression; Presence–absence data; River birds
This article is indexed in ScienceDirect and other databases.