Similar literature
20 similar articles found; search took 35 ms.
1.
Two statistical modelling techniques, generalized additive models (GAM) and multivariate adaptive regression splines (MARS), were used to analyse relationships between the distributions of 15 freshwater fish species and their environment. GAM and MARS models were fitted individually for each species, and a MARS multiresponse model was fitted in which the distributions of all species were analysed simultaneously. Model performance was evaluated using changes in deviance in the fitted models and the area under the receiver operating characteristic curve (ROC), calculated using a bootstrap assessment procedure that simulates predictive performance for independent data. Results indicate little difference between the performance of GAM and MARS models, even when MARS models included interaction terms between predictor variables. Results from MARS models are much more easily incorporated into other analyses than those from GAM models. The strong performance of a MARS multiresponse model, particularly for species of low prevalence, suggests that it may have distinct advantages for the analysis of large datasets. Its identification of a parsimonious set of environmental correlates of community composition, coupled with its ability to robustly model species distributions in relation to those variables, can be seen as converging strongly with the purposes of traditional ordination techniques.

2.
Statistical methods as developed and used in decision making and scientific research are of recent origin. The logical foundations of statistics are still under discussion and some care is needed in applying the existing methodology and interpreting results. Some pitfalls in statistical data analysis are discussed and the importance of cross examination of data (or exploratory data analysis) before using specific statistical techniques is emphasized. Comments are made on the treatment of outliers, choice of stochastic models, use of multivariate techniques and the choice of software (expert systems) in statistical analysis. The need for developing new methodology with particular relevance to environmental research and policy is stressed.
Dr Rao is Eberly Professor of Statistics and Director of the Penn State Center for Multivariate Analysis. He has received PhD and ScD degrees from Cambridge University, and has been awarded numerous honorary doctorates from universities around the world. He is a Fellow of the Royal Society, UK; Fellow of the Indian National Science Academy; Foreign Honorary Member of the American Academy of Arts and Sciences; Life Fellow of King's College, Cambridge; and Founder Fellow of the Third World Academy of Sciences. He is Honorary Fellow and President of the International Statistical Institute and the Biometric Society, and an elected Fellow of the Institute of Mathematical Statistics. He has made outstanding contributions to virtually all important topics of theoretical and applied statistics, and many results bear his name. He has been Editor of Sankhya and the Journal of Multivariate Analysis, and serves on international advisory boards of several professional journals, including Environmetrics and the Journal of Environmental Statistics. This paper is based on the keynote address to the Seventh Annual Conference on Statistics of the United States Environmental Protection Agency.

3.
New approaches to modelling fish-habitat relationships (cited 1 time in total: 0 self-citations, 1 citation by others)
Ecologists often develop models that describe the relationship between faunal communities and their habitat. Coral reef fishes have been the focus of numerous such studies, which have used a wide range of statistical tools to answer an equally wide range of questions. Here, we apply a series of both conventional statistical techniques (linear and generalized additive regression models) and novel machine-learning techniques (the support vector machine and three ensemble techniques used with regression trees) to predict fish species richness, biomass, and diversity from a range of habitat variables. We compare the techniques in terms of their predictive performance, and we compare a subset of the models in terms of the influence each habitat variable has for the predictions. Prediction errors are estimated by cross-validation, and variable importance is assessed using permutations of individual variable values. For predictions of species richness and diversity the tree-based models generally and the random forest model specifically are superior (produce the lowest errors). These model types are all able to model both nonlinear and interaction effects. The linear model, unable to model either effect type, performs the worst (produces the highest errors). For predictions of biomass, the generalized additive model is superior, and the support vector machine performs the worst. Depth range, the difference between maximum and minimum water depth at a given site, is identified as the most important variable in the majority of models predicting the three fish community variables. However, variable importance is highly dependent upon model type, which leads to questions regarding the interpretation of variable importance and its proper use as an indicator of causality. The representation of ecological relationships by tree-based ensemble learners will improve predictive performance, and provide a new avenue for exploring ecological relationships, both statistical and causal. 

4.
Abstract: Over the last decade, criticisms of null-hypothesis significance testing have grown dramatically, and several alternative practices, such as confidence intervals, information theoretic, and Bayesian methods, have been advocated. Have these calls for change had an impact on the statistical reporting practices in conservation biology? In 2000 and 2001, 92% of sampled articles in Conservation Biology and Biological Conservation reported results of null-hypothesis tests. In 2005 this figure dropped to 78%. There were corresponding increases in the use of confidence intervals, information theoretic, and Bayesian techniques. Of those articles reporting null-hypothesis testing (which still easily constitute the majority), very few report statistical power (8%) and many misinterpret statistical nonsignificance as evidence for no effect (63%). Overall, results of our survey show some improvements in statistical practice, but further efforts are clearly required to move the discipline toward improved practices.

5.
The analysis of habitat selection in radio-tagged animals is approached by comparing the portions of use against the portions of availability observed for each habitat type. Since data are linearly dependent with singular variance-covariance matrices, standard multivariate statistical tests cannot be applied. To bypass the problem, compositional data analysis is customarily performed via log-ratio transform of sample observations. The procedure is criticized in this paper, emphasizing the several drawbacks which may arise from the use of compositional analysis. An alternative nonparametric solution is proposed in the framework of multiple testing. The habitat use is assessed separately for each habitat type by means of the sign test performed on the original observations. The resulting p values are combined in an overall test statistic whose significance is determined by permuting sample observations. The theoretical findings of the paper are checked by simulation studies. Applications to case studies previously considered in the literature are discussed.

6.
Clough Y. Ecology, 2012, 93(8): 1809-1815
The need to model and test hypotheses about complex ecological systems has led to a steady increase in use of path analytical techniques, which allow the modeling of multiple multivariate dependencies reflecting hypothesized causation and mechanisms. The aim is to achieve the estimation of direct, indirect, and total effects of one variable on another and to assess the adequacy of whole models. Path analytical techniques based on maximum likelihood currently used in ecology are rarely adequate for ecological data, which are often sparse, multi-level, and may contain nonlinear relationships as well as nonnormal response data such as counts or proportion data. Here I introduce a more flexible approach in the form of the joint application of hierarchical Bayes, Markov chain Monte Carlo algorithms, Shipley's d-sep test, and the potential outcomes framework to fit path models as well as to decompose and estimate effects. An example based on the direct and indirect interactions between ants, two insect herbivores, and a plant species demonstrates the implementation of these techniques, using freely available software.

7.
The statistical literature contains many univariate and multivariate skewness measures that allow two datasets to be compared, some of which are defined in terms of quantile values. In most situations, the comparison between two random vectors focuses on univariate comparisons of conditional random variables truncated in quantiles; this kind of comparison is of particular interest in the environmental sciences. In this work, we describe a new approach to comparing skewness in terms of the univariate convex transform ordering proposed by van Zwet (Convex transformations of random variables. Mathematical Centre Tracts, Amsterdam, 1964), associated with skewness as well as concentration. The key to these comparisons is the underlying dependence structure of the random vectors. Below we describe graphical tools and use several examples to illustrate these comparisons.

8.
A large dataset (n = 488) was acquired from 2003 to 2006 for five Italian harbours and three control areas to determine trace element (Al, As, Cd, Cu, Zn, Pb, Hg, Cr, and Ni) levels in sediments. Results were used to evaluate, on a multivariate statistical basis, pollution levels, significant relationships between observed levels and specific factors, and enrichment factors. Of the factors tested, the main human use of the harbour best explained the segregation of the observed trace element fingerprints. Compared with the concentration limit approach, the evaluation of enrichment factors, even if affected by mathematical approximations, represented a useful tool for environmental studies, allowing evaluation of the presence of sediments enriched by human activities and reducing the occurrence of both false positives and false negatives due to natural differences in aluminosilicate levels.

9.
Ecological Modelling, 2005, 186(3): 280-289
Increasing use is being made in conservation management of statistical models that couple extensive collections of species and environmental data to make predictions of the geographic distributions of species. While the relationships fitted between a species and its environment are relatively transparent for many of these modeling techniques, others are more "black box" in character, only producing geographic predictions and providing minimal or untraditional summaries of the fitted relationships on which these predictions are based. This in turn prevents robust evaluation of the ecological sensibility of such models, a necessary process if model predictions are to be treated with confidence. Here we propose a new but simple method for visualizing modeled responses that can be implemented with any modeling method, and demonstrate its application using five common methods applied to the prediction of an Australian tree species. This is achieved by insetting an "evaluation strip" into the spatial data layers, which, after predictions have been made, can be clipped out and used for creating plots of the modelled responses. We present results from applying the evaluation strip to five algorithms: GLMs, GAMs, CLIM, DOMAIN and MARS. Evaluation strips can be constructed to investigate either univariate responses, or the simultaneous variation in predicted values in relation to two variables. The latter option is particularly useful for evaluating responses in models that allow the fitting of complex interaction terms.

10.
Abstract: Multivariate classifications of environmental factors are used as frameworks for conservation management. Although classification performance is likely to be sensitive to choice of input variables, these choices have been subjective in most previous studies. We used the Mantel test on a limited set of sites for which biological data were available to iteratively seek a definition of environmental space (i.e., intersite distances calculated with a set of appropriately transformed and weighted environmental variables) that had maximal correlation with the same sites described in a biological space. The procedure was used to select input variables for a classification of New Zealand's rivers that discriminates variation in fish communities for biodiversity management. The classification performed (i.e., discriminated biological variation) better than classifications with subjectively chosen variables. The inherently linear measures of environmental distance that underlie multivariate environmental classifications mean that they will perform best if they are defined based on variables for which there is a linear variation in the biological community throughout the entire range of the variable. Classification performance will therefore be improved when variables that have nonlinear relationships with biological variation are transformed to make their relationship with biological turnover more linear and when the contributions of environmental factors that have particularly strong relationships with biological variation are increased by weighting. Our results indicate that attention to the manner in which environmental space is defined improves the efficacy of multivariate classification and other techniques in which the environment is used as a surrogate for biological variation.
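The Mantel test named in this abstract correlates two intersite distance matrices and judges significance by permuting site labels. A minimal sketch is given below, using only the Python standard library. The site data, the Euclidean distance choice, and all function names are hypothetical illustrations, not the authors' procedure; the iterative search over variable transformations and weights described in the abstract is not shown.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(env_dist, bio_dist, permutations=999, seed=0):
    """Return (r, p): Mantel correlation and a one-sided permutation p value."""
    rng = random.Random(seed)
    n = len(env_dist)
    bio_flat = upper_triangle(bio_dist)
    observed = pearson(upper_triangle(env_dist), bio_flat)
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute site labels: rows and columns are reordered together.
        permuted = [[env_dist[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), bio_flat) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


def euclidean_matrix(points):
    """Pairwise Euclidean distances between the rows of `points`."""
    return [[math.dist(p, q) for q in points] for p in points]


# Hypothetical data: six sites, two environmental variables, two species counts.
env = [(1.0, 0.2), (1.5, 0.1), (3.0, 0.9), (3.2, 1.1), (5.0, 2.0), (5.5, 2.2)]
bio = [(10, 0), (9, 1), (5, 4), (4, 5), (1, 9), (0, 10)]
r, p = mantel(euclidean_matrix(env), euclidean_matrix(bio))
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

In a variable-selection setting such as the one described, a statistic like `r` could be recomputed for each candidate transformation or weighting of the environmental variables and the candidate with the highest correlation retained; that search loop is left out here.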

11.
Bayesian Methods in Conservation Biology (cited 10 times in total: 0 self-citations, 10 citations by others)
Abstract: Bayesian statistical inference provides an alternate way to analyze data that is likely to be more appropriate to conservation biology problems than traditional statistical methods. I contrast Bayesian techniques with traditional hypothesis-testing techniques using examples applicable to conservation. I use a trend analysis of two hypothetical populations to illustrate how easy it is to understand Bayesian results, which are given in terms of probability. Bayesian trend analysis indicated that the two populations had very different chances of declining at biologically important rates. For example, the probability that the first population was declining faster than 5% per year was 0.00, compared to a probability of 0.86 for the second population. The Bayesian results appropriately identified which population was of greater conservation concern. The Bayesian results contrast with those obtained with traditional hypothesis testing. Hypothesis testing indicated that the first population, which the Bayesian analysis indicated had no chance of declining at > 5% per year, was declining significantly because it was declining at a slow rate and the abundance estimates were precise. Despite the high probability that the second population was experiencing a serious decline, hypothesis testing failed to reject the null hypothesis of no decline because the abundance estimates were imprecise. Finally, I extended the trend analysis to illustrate Bayesian decision theory, which allows for choice between more than two decisions and allows explicit specification of the consequences of various errors. The Bayesian results again differed from the traditional results: the decision analysis led to the conclusion that the first population was declining slowly and the second population was declining rapidly.
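The contrast drawn in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the posterior for the annual rate of change is normal with mean and standard deviation equal to the trend estimate and its standard error (roughly what a flat prior would give). The two populations and their numbers below are hypothetical and are not the values behind the 0.00 and 0.86 probabilities reported above.

```python
from statistics import NormalDist


def prob_decline_faster_than(threshold, posterior_mean, posterior_sd):
    """Posterior probability that the annual rate of change is below -threshold."""
    return NormalDist(posterior_mean, posterior_sd).cdf(-threshold)


def two_sided_p_value(estimate, standard_error):
    """Two-sided p value for the null hypothesis of no trend (z test)."""
    z = abs(estimate) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical trend estimates (annual proportional change) and uncertainties.
populations = {
    "slow, precisely estimated decline": (-0.01, 0.005),
    "steep, imprecisely estimated decline": (-0.10, 0.06),
}
for label, (rate, sd) in populations.items():
    p_serious = prob_decline_faster_than(0.05, rate, sd)
    p_null = two_sided_p_value(rate, sd)
    print(f"{label}: P(decline > 5%/yr) = {p_serious:.2f}, "
          f"null-test p = {p_null:.3f}")
```

With these made-up inputs the first population is "significant" under the null-hypothesis test yet has essentially no posterior probability of a serious decline, while the second is "not significant" yet probably declining faster than 5% per year, which is the qualitative pattern the abstract describes.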

12.
13.
Ecological Modelling, 2007, 200(1-2): 1-19
Given the importance of knowledge of species distribution for conservation and climate change management, continuous and progressive evaluation of the statistical models predicting species distributions is necessary. Current models are evaluated in terms of ecological theory used, the data model accepted and the statistical methods applied. Focus is restricted to Generalised Linear Models (GLM) and Generalised Additive Models (GAM). Certain currently unused regression methods are reviewed for their possible application to species modelling.
A review of recent papers suggests that ecological theory is rarely explicitly considered. Current theory and results support species responses to environmental variables being unimodal and often skewed, though process-based theory is often lacking. Many studies fail to test for unimodal or skewed responses, and straight-line relationships are often fitted without justification.
Data resolution (size of sampling unit) determines the nature of the environmental niche models that can be fitted. A synthesis of differing ecophysiological ideas and the use of biophysical process models could improve the selection of predictor variables. A better conceptual framework is needed for selecting variables.
Comparison of statistical methods is difficult. Predictive success is insufficient and a test of ecological realism is also needed. Evaluation of methods needs artificial data, as there is no knowledge about the true relationships between variables for field data. However, use of artificial data is limited by lack of comprehensive theory.
Three potentially new methods are reviewed. Quantile regression (QR) has potential and a strong theoretical justification in Liebig's law of the minimum. Structural equation modelling (SEM) has an appealing conceptual framework for testing causality but has problems with curvilinear relationships. Geographically weighted regression (GWR), intended to examine spatial non-stationarity of ecological processes, requires further evaluation before being used.
Synthesis and applications: explicit theory needs to be incorporated into species response models used in conservation. For example, testing for unimodal skewed responses should be a routine procedure. Clear statements of the ecological theory used, the nature of the data model and sufficient details of the statistical method are needed for current models to be evaluated. New statistical methods need to be evaluated for compatibility with ecological theory before use in applied ecology. Some recent work with artificial data suggests the combination of ecological knowledge and statistical skill is more important than the precise statistical method used. The potential exists for a synthesis of current species modelling approaches based on their differing ecological insights, not their methodology.

14.
15.

Background

The United Nations Framework Convention on Climate Change recognizes carbon (C) fixation in forests as an important contribution to reducing atmospheric greenhouse gases. Spatial differentiation of C sequestration in forests, either at the national or at the regional scale, is therefore needed for forest planning purposes. Hence, within the framework of the Forest Focus regulation, the aim of this investigation was to statistically analyse the factors influencing C fixation and to use the corresponding associations in a predictive mapping approach at the regional scale, using the German federal state of North Rhine-Westphalia as an example. The results of the methodical scheme outlined in this article are to be compared with a previously published approach applied to the same data.

Methods

Site-specific data on C sequestration in humus, forest trees/dead wood and soil from two forest monitoring networks were intersected with available surface information on topography, soil, climate and forest growing areas and districts. Next, the associations between C sequestration and the influencing factors were examined and modelled by linear regression analyses. The resulting regression equations were applied to the surface data to predictively map C sequestration for the entire study area.

Results

The computations yielded an estimate of 146.7 million t C sequestered in the forests of North Rhine-Westphalia, corresponding to 168.6 t/ha. The calculated values agree well with corresponding values reported in the literature. Furthermore, the results are almost identical to those of another pilot study in which a different statistical methodology was applied to the same database. Nevertheless, the underlying regression models explain only a small share of the overall variance of C fixation. This might mainly be due to data quality and to influencing factors missing from the analyses.

Discussion

In another study, an alternative approach was introduced to map the spatial differentiation of C sequestration in North Rhine-Westphalia based on the combination of geostatistics, decision tree analyses and GIS techniques. There, the overall mean of C sequestration amounted to 177 t C/ha, which is 8.4 t C/ha higher than the value calculated in the study at hand and 14 t C/ha below the roughly estimated Germany-wide mean of 191 t C/ha.
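The figures quoted in the Results and Discussion can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the abstract; the implied forest area is a derived quantity that the abstract does not report and is shown only as a consistency check.

```python
total_carbon_t = 146.7e6   # regression estimate: 146.7 million t C
mean_regression = 168.6    # t C/ha, regression approach (study at hand)
mean_geostat = 177.0       # t C/ha, geostatistics / decision-tree study
mean_germany = 191.0       # t C/ha, rough Germany-wide mean

diff_between_studies = mean_geostat - mean_regression
diff_to_germany = mean_germany - mean_geostat
# Derived, not stated in the abstract: forest area implied by total / mean.
implied_area_ha = total_carbon_t / mean_regression

print(f"difference between studies: {diff_between_studies:.1f} t C/ha")
print(f"difference to Germany-wide mean: {diff_to_germany:.1f} t C/ha")
print(f"implied forest area: {implied_area_ha:,.0f} ha")
```

Both differences reproduce the 8.4 t C/ha and 14 t C/ha quoted in the Discussion.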

Conclusions

The surface estimations of C pools in living forest trees/dead wood, the humus layer and the mineral soil make it possible to map the fixation of the greenhouse gas CO2 in forests at the regional scale. The estimations derived in this study are in good accordance with estimations based on techniques which, in contrast, allowed neither spatial differentiation nor mapping. The presented approach should be validated by application of other statistical techniques and by use of Germany-wide inventory data. Furthermore, C sequestration should be modelled according to different climate change scenarios by combining statistical methods and dynamic modelling.

16.
Natural mineral waters and spas have been used for therapeutic purposes for centuries. Portugal is very rich in thermal waters and spas, which are mainly distributed across the northern and central regions, where the Beira Interior region is located. The use of thermal waters for therapeutic purposes has always aroused interest, and the indication of a treatment for a specific pathological condition depends on the physicochemical fingerprint of the water. In the present work, besides a literature review of the physicochemical composition of the thermal waters of the Beira Interior region and their therapeutic indications, an exhaustive multivariate analysis (principal component analysis and cluster analysis) was carried out to assess the correlation between different physicochemical parameters and the therapeutic indications claimed for these spas and thermal waters. These statistical methods enable classification of thermal water compositions into different groups according to the variables selected, making it possible to interpret the variables affecting water composition. Monfortinho and Longroiva are clearly quite different from the others, and Cró and Fonte Santa de Almeida appear together in all analyses, suggesting a strong resemblance between these waters. The results demonstrate the role of the major components of the studied thermal waters in a particular therapeutic purpose or indication and hence, based on compositional and physicochemical properties, partially explain their therapeutic qualities and beneficial effects on human health.
This classification agreed with the therapeutic indications approved by the Portuguese National Health Authority and proved to be a valuable tool for the regional typology of medicinal mineral waters, constituting an important guide to the therapeutic armamentarium for specific pathological disorders.

17.
18.
Global positioning system (GPS) collars have revolutionized the collection of animal location data; however, it is well-recognized that considerable bias can be present in these data due to habitat or behavior-induced obstruction of satellite signals resulting in inaccurate or missing locations. To date, no explicit theoretical framework of GPS fix acquisition specific to animal telemetry has been presented, and studies make differing assumptions regarding factors influencing GPS fix acquisition and how these data should be analyzed. Inappropriate statistical models have been used, interaction effects have been misunderstood, and the implementation of bias mitigation techniques has been problematic. Herein we outline current conceptual and analytical problems in the GPS animal telemetry literature, and subsequently present a theoretical model-based framework for GPS fix acquisition that clarifies the single and interactive effects of habitat and behavioral obstruction, fix interval, and collar model on GPS collar performance. By recognizing that GPS fix acquisition is a Bernoulli process, it becomes apparent that all forms of obstruction inherently interact with each other, making generalizations across study areas, study species, and collar models problematic. Stationary collar tests to determine the probability of fix acquisition (PFA), location accuracy, and the response to sources of obstruction are thus of limited applicability to animal-deployed collars. Bias mitigation techniques that extrapolate PFA models across samples, especially those using stationary collar tests to correct animal-deployed collars, are theoretically unsound. It is also demonstrated that nonlinearities in the relationships between sources of obstruction and PFA complicate PFA modeling with limited data and that even slight model misspecification can lead to considerable errors in correction factors, especially when using inverse weighting to mitigate bias. 
By emphasizing the importance of GPS collar sensitivity and ephemeris retention, the theoretical framework predicts that newer, more sensitive GPS collars will be less severely biased by sources of obstruction than reported for the older, less sensitive collars that have been used in the majority of GPS performance studies to date and we expect this trend to continue. This heuristic modeling exercise should be of value to researchers planning and analyzing studies using GPS collars and it also establishes a starting point for future theoretical investigations into GPS collar performance and bias mitigation.
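The remark that slight misspecification of the PFA model can lead to considerable errors under inverse weighting can be made concrete with a small calculation. The sketch below assumes the usual form of inverse weighting, in which each acquired location is weighted by 1/PFA. The PFA values and the 0.05 estimation error are hypothetical, and the function names are illustrative rather than taken from the paper.

```python
def inverse_weight(pfa):
    """Weight for an acquired location: 1 / probability of fix acquisition."""
    if not 0 < pfa <= 1:
        raise ValueError("PFA must lie in (0, 1]")
    return 1.0 / pfa


def relative_weight_error(true_pfa, estimated_pfa):
    """Relative error in the weight when the PFA is misestimated."""
    return inverse_weight(estimated_pfa) / inverse_weight(true_pfa) - 1.0


# Hypothetical PFA values; the same absolute error of 0.05 is applied to each.
for true_pfa in (0.9, 0.5, 0.1):
    err = relative_weight_error(true_pfa, true_pfa + 0.05)
    print(f"true PFA {true_pfa:.1f}: weight off by {err:+.1%}")
```

Because the weight is the reciprocal of the PFA, the same absolute error matters far more where fixes are rarely acquired, that is, in exactly the heavily obstructed conditions the correction is meant to address.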

19.
Correspondence analysis with linear external constraints on both the rows and the columns has been mentioned in the ecological literature, but lacks full mathematical treatment and easily available algorithms and software. This paper fills this gap by defining the method as maximizing the fourth-corner correlation between linear combinations, by providing novel algorithms, which demonstrate relationships with related methods, and by making a detailed study of possible biplots and associated approximations. The method is illustrated using ecological data on the abundances of species in sites, where the species are characterized by traits and the sites by environmental variables. The trait data and environment data form the external constraints and the question is which traits and environmental variables are associated, how these associations drive species abundances and how they can be displayed in biplots. With microbiome data becoming widely available, these and related multivariate methods deserve more study as they might be routinely used in the future.

20.
J. Badia, T. Do Chi. Marine Biology, 1976, 36(2): 159-168
The structure and evolution of populations of Squilla mantis (Crustacea: Stomatopoda) have been studied, and the statistical techniques used are discussed. Population studies were performed by multivariate analysis. Thirty-three monthly samples comprising 20 length classes of males and females were examined. Males and females display evolutionary cycles: each cycle begins in June/July and proceeds through a succession of evolution and eventual regression, the latter being particularly marked every 2 years.
