Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Geostatistical models play an important role in spatial data analysis, in which model selection is inevitable. Model selection methods, such as AIC and BIC, are popular for selecting appropriate models. In recent years, some model averaging methods, such as smoothed AIC and smoothed BIC, have also been applied to spatial data models. However, the corresponding averaging estimators are outperformed by optimal model averaging estimators (Hansen in Econometrica 75:1175–1189, 2007) for the ordinary linear models. Therefore, this paper focuses on the optimal model averaging method for geostatistical models. We propose a weight choice criterion for the model averaging estimator on the basis of the generalized degrees of freedom and a data perturbation technique. We further prove theoretically that the resulting estimator is asymptotically optimal in terms of the mean squared error, and numerically demonstrate its satisfactory performance. Finally, the proposed method is applied to a mercury data set.

2.
Behavioural ecologists often study complex systems in which multiple hypotheses could be proposed to explain observed phenomena. For some systems, simple controlled experiments can be employed to reveal part of the complexity; often, however, observational studies that incorporate a multitude of causal factors may be the only (or preferred) avenue of study. We assess the value of recently advocated approaches to inference in both contexts. Specifically, we examine the use of information theoretic (IT) model selection using Akaike’s information criterion (AIC). We find that, for simple analyses, the advantages of switching to an IT-AIC approach are likely to be slight, especially given recent emphasis on biological rather than statistical significance. By contrast, the model selection approach embodied by IT approaches offers significant advantages when applied to problems of more complex causality. Model averaging is an intuitively appealing extension to model selection. However, we were unable to demonstrate consistent improvements in prediction accuracy when using model averaging with IT-AIC; our equivocal results suggest that more research is needed on its utility. We illustrate our arguments with worked examples from behavioural experiments.

3.
There has been a great deal of recent discussion of the practice of regression analysis (or more generally, linear modelling) in behaviour and ecology. In this paper, I wish to highlight two factors that have been under-considered, collinearity and measurement error in predictors, as well as to consider what happens when both exist at the same time. I examine what the consequences are for conventional regression analysis (ordinary least squares, OLS) as well as model averaging methods, typified by information theoretic approaches based around Akaike’s information criterion. Collinearity causes variance inflation of estimated slopes in OLS analysis, as is well known. In the presence of collinearity, model averaging reduces this variance for predictors with weak effects, but also can lead to parameter bias. When collinearity is strong or when all predictors have strong effects, model averaging relies heavily on the full model including all predictors and hence the results from this and OLS are essentially the same. I highlight that it is not safe to simply eliminate collinear variables without due consideration of their likely independent effects as this can lead to biases. Measurement error is also considered, and I show that it can lead to extreme biases when predictors are collinear, have strong effects, and differ in their degree of measurement error. I highlight techniques for dealing with and diagnosing these problems. These results reinforce that automated model selection techniques should not be relied on in the analysis of complex multivariable datasets.

4.
Model averaging, specifically information theoretic approaches based on Akaike’s information criterion (IT-AIC approaches), has had a major influence on statistical practices in the field of ecology and evolution. However, a neglected issue is that, in common with most other model fitting approaches, IT-AIC methods are sensitive to the presence of missing observations. The commonest way of handling missing data is the complete-case analysis (the complete deletion from the dataset of cases containing any missing values). It is well known that this results in reduced estimation precision (or reduced statistical power) and biased parameter estimates; however, the implications for model selection have not been explored. Here we employ an example from behavioural ecology to illustrate how missing data can affect the conclusions drawn from model selection or based on hypothesis testing. We show how missing observations can be recovered to give accurate estimates for IT-related indices (e.g. AIC and Akaike weight) as well as parameters (and their standard errors) by utilizing ‘multiple imputation’. We use this paper to illustrate key concepts from missing data theory and as a basis for discussing available methods for handling missing data. The example is intended to serve as a practically oriented case study for behavioural ecologists deciding on how to handle missing data in their own datasets and also as a first attempt to consider the problems of conducting model selection and averaging in the presence of missing observations.

5.
Managing invaded ecosystems entails making decisions about control strategies in the face of scientific uncertainty and ecological stochasticity. Statistical tools such as model selection and Bayesian decision analysis can guide decision-making by estimating probabilities of outcomes under alternative management scenarios, but these tools have seldom been applied in invasion ecology. We illustrate the use of model selection and Bayesian methods in a case study of smooth cordgrass (Spartina alterniflora) invading Willapa Bay, Washington. To address uncertainty in model structure, we quantified the weight of evidence for two previously proposed hypotheses, that S. alterniflora recruitment varies with climatic conditions (represented by sea surface temperature) and that recruitment is subject to an Allee effect due to pollen limitation. By fitting models to time series data, we found strong support for climate effects, with higher per capita seedling production in warmer years, but no evidence for an Allee effect based on either the total area invaded or the mean distance between neighboring clones. We used the best-supported model to compare alternative control strategies, incorporating uncertainty in parameter estimates and population dynamics. For a fixed annual removal effort, the probability of eradication in 10 years was highest, and final invaded area lowest, if removals targeted the smallest clones rather than the largest or randomly selected clones. The relationship between removal effort and probability of eradication was highly nonlinear, with a sharp threshold separating ~0% and ~100% probability of success, and this threshold was 95% lower in simulations beginning early rather than late in the invasion. This advantage of a rapid response strategy is due to density-dependent population growth, which produces alternative stable equilibria depending on the initial invasion size when control begins.
Our approach could be applied to a wide range of invasive species management problems where appropriate data are available.

6.
This paper presents a multiple-pattern parameter identification and uncertainty analysis approach for robust water quality modeling using a neural network (NN) embedded genetic algorithm (GA). The modeling approach uses an adaptive NN–GA framework to inversely solve the governing equations in a water quality model for multiple parameter patterns, along with an alternating fitness method to maintain solution diversity. The procedure was demonstrated through a coupled 2D hydrodynamic and eutrophication model for Loch Raven Reservoir in Maryland. The inverse problem was formulated as a nonlinear optimization problem minimizing the degree of misfit (DOM) between model results and observed data. A set of NN models was developed to approximate the input-output response relationship of the Loch Raven Reservoir model and was incorporated into a GA framework in an adaptive fashion to search for near-optimal solutions minimizing the DOM. The numerical example showed that the adaptive NN–GA approach is capable of identifying multiple parameter patterns that reproduce the observed data equally well. The resulting parameter patterns were incorporated into the numerical model, and a multiple-pattern robust water quality modeling analysis, along with a compound margin of safety (CMOS) method, was proposed and applied to analyze the parameter pattern uncertainty.

7.
Indices of biotic integrity have become an established tool to quantify the condition of small non-tidal streams and their watersheds. To investigate the effects of watershed characteristics on stream biological condition, we present a new technique for regressing IBIs on watershed-specific explanatory variables. Since IBIs are typically evaluated on an ordinal scale, our method is based on the proportional odds model for ordinal outcomes. To avoid overfitting, we do not use classical maximum likelihood estimation but a component-wise functional gradient boosting approach. Because component-wise gradient boosting has an intrinsic mechanism for variable selection and model choice, determinants of biotic integrity can be identified. In addition, the method offers a relatively simple way to account for spatial correlation in ecological data. An analysis of the Maryland Biological Streams Survey shows that nonlinear effects of predictor variables on stream condition can be quantified while, in addition, accurate predictions of biological condition at unsurveyed locations are obtained.

8.
This paper introduces a flexible skewed link function for modeling ordinal response data with covariates based on the generalized extreme value (GEV) distribution. Commonly used probit, logit and complementary log-log links are prone to link misspecification because of their fixed skewness. The GEV link is flexible in fitting the skewness in the response curve with a free shape parameter. Using Bayesian methodology, it automatically detects the skewness in the response curve along with the model fitting. The flexibility of the proposed model is illustrated by its application to ecological survey data on the coverage of Berberis thunbergii in New England. We employ the latent variable approach of Albert and Chib (J Am Stat Assoc 88:669–679, 1993) to develop computational schemes. For model selection, we employ the Deviance Information Criterion (DIC).

9.
Link WA, Barker RJ. Ecology, 2006, 87(10): 2626-2635
Statistical thinking in wildlife biology and ecology has been profoundly influenced by the introduction of AIC (Akaike's information criterion) as a tool for model selection and as a basis for model averaging. In this paper, we advocate the Bayesian paradigm as a broader framework for multimodel inference, one in which model averaging and model selection are naturally linked, and in which the performance of AIC-based tools is naturally evaluated. Prior model weights implicitly associated with the use of AIC are seen to highly favor complex models: in some cases, all but the most highly parameterized models in the model set are virtually ignored a priori. We suggest the usefulness of the weighted BIC (Bayesian information criterion) as a computationally simple alternative to AIC, based on explicit selection of prior model probabilities rather than acceptance of default priors associated with AIC. We note, however, that both procedures are only approximate to the use of exact Bayes factors. We discuss and illustrate technical difficulties associated with Bayes factors, and suggest approaches to avoiding these difficulties in the context of model selection for a logistic regression. Our example highlights the predisposition of AIC weighting to favor complex models and suggests a need for caution in using the BIC for computing approximate posterior model weights.
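The contrast this abstract draws between AIC weights and BIC weights can be illustrated with a small sketch. It is not taken from the paper: the function name `ic_weights`, the log-likelihoods, the parameter counts and the sample size are all hypothetical, and the BIC weights are read as approximate posterior model probabilities only under the added assumption of equal prior model probabilities.

```python
import math

def ic_weights(log_likelihoods, n_params, n_obs=None):
    """Normalized model weights from AIC (n_obs is None) or BIC (n_obs given).

    Weights are proportional to exp(-0.5 * delta), where delta is each
    model's criterion value minus the smallest value in the set.
    """
    scores = []
    for ll, k in zip(log_likelihoods, n_params):
        penalty = 2 * k if n_obs is None else k * math.log(n_obs)
        scores.append(-2 * ll + penalty)
    best = min(scores)
    raw = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical maximized log-likelihoods for three nested models
# with 2, 4 and 6 parameters, fitted to 100 observations.
loglik = [-120.0, -118.5, -118.0]
k = [2, 4, 6]

aic_w = ic_weights(loglik, k)
bic_w = ic_weights(loglik, k, n_obs=100)
print("AIC weights:", [round(w, 3) for w in aic_w])
print("BIC weights:", [round(w, 3) for w in bic_w])
```

In this made-up example the BIC penalty, k·log(n), exceeds the AIC penalty, 2k, so the BIC weights concentrate more heavily on the simplest model, which matches the direction of the effect the abstract describes.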

10.
Inverse parameter estimation of individual-based models (IBMs) is a research area which is still in its infancy, in a context where conventional statistical methods are not well suited to confront this type of model with data. In this paper, we propose an original evolutionary algorithm which is designed for the calibration of complex IBMs, i.e. models characterized by high stochasticity, parameter uncertainty and numerous non-linear interactions between parameters and model output. Our algorithm corresponds to a variant of the population-based incremental learning (PBIL) genetic algorithm, with a specific “optimal individual” operator. The method is presented in detail and applied to the individual-based model OSMOSE. The performance of the algorithm is evaluated and estimated parameters are compared with an independent manual calibration. The results show that automated and convergent methods for inverse parameter estimation are a significant improvement to existing ad hoc methods for the calibration of IBMs.

11.
The Eastern Arc Mountains (EAMs) of Tanzania and Kenya support some of the most ancient tropical rainforest on Earth. The forests are a global priority for biodiversity conservation and provide vital resources to the Tanzanian population. Here, we make a first attempt to predict the spatial distribution of 40 EAM tree species, using generalised additive models, plot data and environmental predictor maps at sub 1 km resolution. The results of three modelling experiments are presented, investigating predictions obtained by (1) two different procedures for the stepwise selection of predictors, (2) down-weighting absence data, and (3) incorporating an autocovariate term to describe fine-scale spatial aggregation. In response to recent concerns regarding the extrapolation of model predictions beyond the restricted environmental range of training data, we also demonstrate a novel graphical tool for quantifying envelope uncertainty in restricted range niche-based models (envelope uncertainty maps). We find that even for species with very few documented occurrences useful estimates of distribution can be achieved. Initiating selection with a null model is found to be useful for explanatory purposes, while beginning with a full predictor set can over-fit the data. We show that a simple multimodel average of these two best-model predictions yields a superior compromise between generality and precision (parsimony). Down-weighting absences shifts the balance of errors in favour of higher sensitivity, reducing the number of serious mistakes (i.e., falsely predicted absences); however, response functions are more complex, exacerbating uncertainty in larger models. Spatial autocovariates help describe fine-scale patterns of occurrence and significantly improve explained deviance, though if important environmental constraints are omitted then model stability and explanatory power can be compromised. 
We conclude that the best modelling practice is contingent both on the intentions of the analyst (explanation or prediction) and on the quality of distribution data; generalised additive models have potential to provide valuable information for conservation in the EAMs, but methods must be carefully considered, particularly if occurrence data are scarce. Full results and details of all species models are supplied in an online Appendix.

12.
Markov Chain Monte Carlo on optimal adaptive sampling selections   (Cited by: 1 in total; self-citations: 0; citations by others: 1)
Under a Bayesian population model with a given prior distribution, the optimal sampling strategy with a fixed sample size n is an n-phase adaptive one. That is, the selection of the next sampling units should sequentially depend on the information obtained from the previously selected units, including the observed values of interest. Such an optimal strategy is in general not executable in practice due to its intensive computation. In many survey sampling situations, an important problem is that one would like to select a set of units in addition to a certain number of sampling units which have been observed. If the optimal strategy is an adaptive one, the selection of the additional units should take both the labels and the observed values of the already selected units into account. Hence, a simpler optimal two-phase adaptive sampling strategy under a Bayesian population model is proposed in this article for practical interest. A Markov chain Monte Carlo method is used to approximate the posterior joint distribution of the unobserved population units after the first phase sampling, for the optimal selection of the second phase sample. This approximation method is found to be successful in selecting the optimal second-phase sample. Finally, this optimal strategy is applied to a set of data from a study of geothermal CO2 emissions in Yellowstone National Park as a practical illustrative example.

13.
Flight initiation distance (FID), the distance at which an organism begins to flee an approaching threat, is an important component of antipredator behavior and a potential indicator of an animal’s perception of threat. In a field study on parrotfishes, we tested the predictions that FID in response to a diver will increase with body size, a correlate of reproductive value, and with experience of threat from humans. We studied a broad size range in four species on fringing reefs inside and outside the Barbados Marine Reserve. We used Akaike's Information Criterion modified for small sample sizes (AICc) and model averaging to select and assess alternative models. Body size, reserve protection, and distance to a refuge, but not species, had strong support in explaining FID. FID increased with body size and generally remained two to ten times fish total length. FID was greater outside the reserve, especially in larger fish. Although we were not able to completely rule out other effects of size or reserve, this study supports predictions of an increase in FID with reproductive value and threat from humans.
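For readers unfamiliar with the small-sample correction mentioned in this abstract, the standard AICc formula can be written as a short sketch. The function name `aicc` and the numbers in the usage lines are hypothetical; nothing here reproduces the study's models or data.

```python
def aicc(log_likelihood, k, n):
    """AIC with the standard small-sample correction.

    AICc = -2 * logL + 2k + 2k(k + 1) / (n - k - 1),
    where k is the number of estimated parameters and n the sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical model: log-likelihood -50 with 3 parameters.
small_sample = aicc(-50.0, k=3, n=20)    # correction term is 24 / 16 = 1.5
large_sample = aicc(-50.0, k=3, n=2000)  # correction term is nearly zero
print(small_sample, large_sample)
```

The correction term shrinks towards zero as n grows, so AICc and AIC agree for large samples and differ mainly when n is small relative to k.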

14.
Spatial concurrent linear models, in which the model coefficients are spatial processes varying at a local level, are flexible and useful tools for analyzing spatial data. One approach places stationary Gaussian process priors on the spatial processes, but in applications the data may display strong nonstationary patterns. In this article, we propose a Bayesian variable selection approach based on wavelet tools to address this problem. The proposed approach does not involve any stationarity assumptions on the priors, and instead we impose a mixture prior directly on each wavelet coefficient. We introduce an option to control the priors such that high resolution coefficients are more likely to be zero. Computationally efficient MCMC procedures are provided to address posterior sampling, and uncertainty in the estimation is assessed through posterior means and standard deviations. Examples based on simulated data demonstrate the estimation accuracy and advantages of the proposed method. We also illustrate the performance of the proposed method for real data obtained through remote sensing.

15.
A new mathematical dose-response model for the expected probability of toxic response and also for the expected measure of the overdispersion parameter for the reproductive and developmental risk assessment is proposed. The model for the expected probability of toxic response is an improvised Weibull dose-response model incorporating the litter-size effect while the model for the overdispersion parameter is a polynomial function of the dose level. A beta-binomial distribution for the number of offspring showing toxic responses in a litter satisfactorily accounts for the extra-binomial variation and the intralitter correlation of responses of these pups. Confidence limits for low-dose extrapolation are based on the asymptotic distribution of the likelihood ratio. The safe dose for human exposure is then calculated by simple linear extrapolation. The model for overdispersion allows us to obtain the estimates of the overdispersion parameter at these dosages. This was not possible in the earlier models. The proposed model is illustrated by an application to a study on the effect of exposure to diethylhexylphthalate in mice. The results are compared with those obtained by Chen and Kodell (1989) who have applied the simple Weibull dose-response model to the same data set. This paper was prepared with partial support from the United States Environmental Protection Agency under a Cooperative Agreement Number CR-815273. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency and no official endorsement should be inferred.
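As a rough illustration of the two ingredients named in this abstract, the sketch below pairs a simple Weibull dose-response curve with a beta-binomial litter distribution. It is a simplification under stated assumptions: the litter-size effect and the polynomial overdispersion model of the paper are not reproduced, the parameterization by mean `mu` and intralitter correlation `rho` is one common choice rather than necessarily the authors', and every numerical value is hypothetical.

```python
import math

def weibull_response(dose, a, b, w):
    """Simple Weibull dose-response: P(d) = 1 - exp(-(a + b * d**w))."""
    return 1.0 - math.exp(-(a + b * dose ** w))

def log_beta(x, y):
    """Logarithm of the beta function, via log-gamma for numerical stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_pmf(x, n, mu, rho):
    """P(X = x) for a litter of size n with mean response probability mu
    and intralitter correlation rho (both strictly between 0 and 1)."""
    alpha = mu * (1.0 - rho) / rho
    beta = (1.0 - mu) * (1.0 - rho) / rho
    log_choose = (math.lgamma(n + 1) - math.lgamma(x + 1)
                  - math.lgamma(n - x + 1))
    return math.exp(log_choose
                    + log_beta(x + alpha, n - x + beta)
                    - log_beta(alpha, beta))

# Hypothetical parameters: background a, slope b, shape w, dose 1.0.
litter_size = 10
rho = 0.2
mu = weibull_response(1.0, a=0.05, b=0.3, w=1.5)
litter_probs = [beta_binomial_pmf(x, litter_size, mu, rho)
                for x in range(litter_size + 1)]
mean_affected = sum(x * p for x, p in enumerate(litter_probs))
print(round(mu, 4), round(mean_affected, 4))
```

With this parameterization the variance of the number of affected pups is n·mu·(1 − mu)·(1 + (n − 1)·rho), so any rho above zero inflates the variance relative to a plain binomial, which is the extra-binomial variation the abstract refers to.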

16.
This paper presents a statistical method for detecting distinct scales of pattern for mosaics of irregular patches, by means of perimeter–area relationships. Krummel et al. (1987) were the first to develop a method for detecting different scaling domains in a landscape of irregular patches, but this method requires investigator judgment and is not completely satisfying. Grossi et al. (2001) suggested a modification of Krummel's method in order to detect objectively the change points between different scaling domains. Their procedure is based on the selection of the best piecewise linear regression model using a set of statistical tests. Even though the change points were estimated, the null distributions used for testing purposes were those appropriate for known change points. The present paper investigates the effect that estimating the change points has on the underlying distribution theory. The procedure we suggest is based on the selection of the best piecewise linear regression model using a likelihood ratio (LR) test. Each segment of the piecewise linear model corresponds to a fractal domain. Breakpoints between different segments are unknown, so the piecewise linear models are non-linear. In this case, the frequency distribution of the LR statistic cannot be approximated by a chi-squared distribution. Instead, Monte Carlo simulation is used to obtain an empirical null distribution of the LR statistic. The suggested method is applied to three patch types (CORINE biotopes) located in the Val Baganza watershed of Italy.
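The idea of replacing the chi-squared reference distribution with a simulated one can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes Gaussian errors, compares a single line against two separately fitted segments (the paper's piecewise model may impose continuity and allow more segments), assumes the data are sorted by x, and uses made-up data. All function names are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares line; returns (intercept, slope, rss)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss

def lr_statistic(xs, ys, min_seg=3):
    """LR statistic for one line versus the best two-segment fit,
    with the breakpoint searched over all admissible split positions."""
    n = len(xs)
    rss0 = fit_line(xs, ys)[2]
    rss1 = min(fit_line(xs[:i], ys[:i])[2] + fit_line(xs[i:], ys[i:])[2]
               for i in range(min_seg, n - min_seg + 1))
    return n * math.log(rss0 / rss1)

def monte_carlo_p_value(xs, ys, n_sim=200, seed=0):
    """Simulate the null (single-line) model to get an empirical p-value."""
    rng = random.Random(seed)
    n = len(xs)
    intercept, slope, rss0 = fit_line(xs, ys)
    sigma = math.sqrt(rss0 / (n - 2))
    observed = lr_statistic(xs, ys)
    exceed = 0
    for _ in range(n_sim):
        sim = [intercept + slope * x + rng.gauss(0.0, sigma) for x in xs]
        if lr_statistic(xs, sim) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)

# Made-up data with a change of slope at x = 7 (for example, log area
# against log perimeter across two scaling domains).
data_rng = random.Random(1)
xs = [0.5 * i for i in range(30)]
ys = [(0.5 * x if x < 7 else 3.5 + 1.0 * (x - 7)) + data_rng.gauss(0.0, 0.1)
      for x in xs]
lr_obs, p_value = monte_carlo_p_value(xs, ys)
print("LR =", round(lr_obs, 2), "empirical p =", round(p_value, 4))
```

Because the breakpoint is re-estimated in every simulated data set, the empirical null distribution accounts for the search over split positions, which is the effect the abstract says a chi-squared approximation misses.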

17.
In this study we aimed to combine knowledge of the ecophysiology and genetics of European beech to assess the potential of this species to adapt to environmental change. Therefore, we performed field and experimental studies on the genetic and ecophysiological functioning of beech. This information was integrated through a coupled genetic–ecophysiological model for individual trees that was parameterized with information derived from our own studies or from the literature. Using the model, we evaluated the adaptive response of beech stands in two ways: firstly, through sensitivity analyses (of initial genetic diversity, pollen dispersal distance, heritability of selected phenotypic traits, and forest management, representing disturbances) and secondly, through the evaluation of the responses of phenotypic traits and their genetic diversity to four management regimes applied to 10 study plots distributed over Western Europe. The model results indicate that the interval between recruitment events strongly affects the rate of adaptive response, because selection is most severe during the early stages of forest development. Forest management regimes largely determine recruitment intervals and thereby the potential for adaptive responses. Forest management regimes also determine the number of mother trees that contribute to the next generation and thereby the genetic variation that is maintained. Consequently, undisturbed forests maintain the largest amount of genetic variation, as recruitment intervals approach the longevity of trees and many mother trees contribute to the next generation. However, undisturbed forests have the slowest adaptive response, for the same reasons. Gene flow through pollen dispersal may compensate for the loss in genetic diversity brought about by selection.
The sensitivity analysis showed that the total genetic diversity of a 2 ha stand is not affected by gene flow if the pollen distance distribution is varied from highly left-skewed to almost flat. However, a stand with a prevailing short-distance gene flow has a more pronounced spatial genetic structure than stands with equal short- and long-distance gene flows. The build-up of a spatial genetic structure is also strongly determined by the recruitment interval. Overall, the modelling results indicate that European beech has high adaptive potential to environmental change if recruitment intervals are short and many mother trees contribute to the next generation. The findings have two implications for modelling studies on the impacts of climate change on forests. Firstly, it cannot be taken for granted that parameter values remain constant over a time horizon of even a few generations; this is particularly important for threshold values subject to strong selection, like budburst, frost hardiness and drought tolerance, as used in species area models. Secondly, forest management should be taken into account in future assessments, as management affects the rate of adaptive response and thereby the response of trees and forests to environmental change, and because few forests are unmanaged. We conclude that a coupled ecophysiological and quantitative genetic tree model is a useful tool for such studies.

18.
Statistical methods emphasizing formal hypothesis testing have dominated the analyses used by ecologists to gain insight from data. Here, we review alternatives to hypothesis testing including techniques for parameter estimation and model selection using likelihood and Bayesian techniques. These methods emphasize evaluation of weight of evidence for multiple hypotheses, multimodel inference, and use of prior information in analysis. We provide a tutorial for maximum likelihood estimation of model parameters and model selection using information theoretics, including a brief treatment of procedures for model comparison, model averaging, and use of data from multiple sources. We discuss the advantages of likelihood estimation, Bayesian analysis, and meta-analysis as ways to accumulate understanding across multiple studies. These statistical methods hold promise for new insight in ecology by encouraging thoughtful model building as part of inquiry, providing a unified framework for the empirical analysis of theoretical models, and by facilitating the formal accumulation of evidence bearing on fundamental questions.

19.
Abstract: In conservation biology, uncertainty about the choice of a statistical model is rarely considered. Model-selection uncertainty occurs whenever one model is chosen over plausible alternative models to represent understanding about a process and to make predictions about future observations. The standard approach to representing prediction uncertainty involves the calculation of prediction (or confidence) intervals that incorporate uncertainty about parameter estimates contingent on the choice of a "best" model chosen to represent truth. However, this approach to prediction based on statistical models tends to ignore model-selection uncertainty, resulting in overconfident predictions. Bayesian model averaging (BMA) has been promoted in a range of disciplines as a simple means of incorporating model-selection uncertainty into statistical inference and prediction. Bayesian model averaging also provides a formal framework for incorporating prior knowledge about the process being modeled. We provide an example of the application of BMA in modeling and predicting the spatial distribution of an arboreal marsupial in the Eden region of southeastern Australia. Other approaches to estimating prediction uncertainty are discussed.

20.
Hofner B, Müller J, Hothorn T. Ecology, 2011, 92(10): 1895-1901
Flexible modeling frameworks for species distribution models based on generalized additive models that allow for smooth, nonlinear effects and interactions are of increasing importance in ecology. Commonly, the flexibility of such smooth function estimates is controlled by means of penalized estimation procedures. However, the actual shape remains unspecified. In many applications, this is not desirable as researchers have a priori assumptions on the shape of the estimated effects, with monotonicity being the most important. Here we demonstrate how monotonicity constraints can be incorporated in a recently proposed flexible framework for species distribution models. Our proposal allows monotonicity constraints to be imposed on smooth effects and on ordinal, categorical variables using an additional asymmetric L2 penalty. Model estimation and variable selection for Red Kite (Milvus milvus) breeding was conducted using the flexible boosting framework implemented in R package mboost.
