Similar Literature
20 similar documents retrieved.
1.
Link WA, Barker RJ. Ecology, 2006, 87(10): 2626-2635.
Statistical thinking in wildlife biology and ecology has been profoundly influenced by the introduction of AIC (Akaike's information criterion) as a tool for model selection and as a basis for model averaging. In this paper, we advocate the Bayesian paradigm as a broader framework for multimodel inference, one in which model averaging and model selection are naturally linked, and in which the performance of AIC-based tools is naturally evaluated. Prior model weights implicitly associated with the use of AIC are seen to highly favor complex models: in some cases, all but the most highly parameterized models in the model set are virtually ignored a priori. We suggest the usefulness of the weighted BIC (Bayesian information criterion) as a computationally simple alternative to AIC, based on explicit selection of prior model probabilities rather than acceptance of the default priors associated with AIC. We note, however, that both procedures are only approximations to the use of exact Bayes factors. We discuss and illustrate technical difficulties associated with Bayes factors, and suggest approaches to avoiding these difficulties in the context of model selection for a logistic regression. Our example highlights the predisposition of AIC weighting to favor complex models and suggests a need for caution in using the BIC for computing approximate posterior model weights.
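Both sets of weights in this debate are computed the same way: rescale each criterion value against the best model and exponentiate. A minimal sketch with hypothetical criterion values (the helper name is ours):

```python
import numpy as np

def information_criterion_weights(ic_values):
    """Convert AIC or BIC values into approximate posterior model weights:
    w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j), delta_i = IC_i - min IC."""
    ic = np.asarray(ic_values, dtype=float)
    delta = ic - ic.min()              # best model gets delta = 0
    raw = np.exp(-0.5 * delta)
    return raw / raw.sum()

# Hypothetical criterion values for three nested models of growing complexity.
print("AIC weights:", information_criterion_weights([100.0, 101.2, 104.5]))
print("BIC weights:", information_criterion_weights([100.0, 99.1, 106.3]))
```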

2.
In the statistical modeling of a biological or ecological phenomenon, selecting an optimal model among a collection of candidates is a critical issue. To identify an optimal candidate model, a number of model selection criteria have been developed and investigated based on estimating Kullback’s (Information theory and statistics. Dover, Mineola, 1968) directed or symmetric divergence. Criteria that target the directed divergence include the Akaike (2nd international symposium on information theory. Akadémiai Kiadó, Budapest, Hungary, pp 267–281, 1973; IEEE Trans Autom Control AC 19:716–723, 1974) information criterion, AIC, and the “corrected” Akaike information criterion (Hurvich and Tsai in Biometrika 76:297–307, 1989), AICc; criteria that target the symmetric divergence include the Kullback information criterion, KIC, and the “corrected” Kullback information criterion, KICc (Cavanaugh in Stat Probab Lett 42:333–343, 1999; Aust N Z J Stat 46:257–274, 2004). For overdispersed count data, simple modifications of AIC and AICc have been increasingly utilized: specifically, the quasi Akaike information criterion, QAIC, and its corrected version, QAICc (Lebreton et al. in Ecol Monogr 62(1):67–118, 1992). In this paper, we propose analogues of QAIC and QAICc based on estimating the symmetric as opposed to the directed divergence: QKIC and QKICc. We evaluate the selection performance of AIC, AICc, QAIC, QAICc, KIC, KICc, QKIC, and QKICc in a simulation study, and illustrate their practical utility in an ecological application. In our application, we use the criteria to formulate statistical models of the tick (Dermacentor variabilis) load on a white-footed mouse (Peromyscus leucopus) in northern Missouri.
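The quasi-criteria require only the maximized log-likelihood, a parameter count, and an overdispersion estimate ĉ. A hedged sketch: QAIC/QAICc follow Lebreton et al. (1992); the 3k penalty for QKIC is our reading of the symmetric-divergence analogue proposed here, and the corrected QKICc penalty, being more involved, is omitted:

```python
def qaic(loglik, k, c_hat):
    """Quasi-AIC for overdispersed count data (Lebreton et al. 1992).
    loglik: maximized log-likelihood; k: parameter count (conventionally
    + 1 for c_hat); c_hat: overdispersion, e.g. Pearson chi-square / df."""
    return -2.0 * loglik / c_hat + 2.0 * k

def qaicc(loglik, k, c_hat, n):
    """Small-sample corrected QAIC."""
    return qaic(loglik, k, c_hat) + 2.0 * k * (k + 1) / (n - k - 1)

def qkic(loglik, k, c_hat):
    """Quasi-KIC analogue targeting the symmetric divergence; assumes the
    KIC penalty 3k (Cavanaugh 1999) carries over to quasi-likelihood."""
    return -2.0 * loglik / c_hat + 3.0 * k

# Hypothetical candidate models: (log-likelihood, number of parameters).
models = {"M1": (-230.4, 3), "M2": (-228.9, 5)}
c_hat, n = 1.8, 60
for name, (ll, k) in models.items():
    print(name, round(qaic(ll, k + 1, c_hat), 2),
          round(qaicc(ll, k + 1, c_hat, n), 2), round(qkic(ll, k + 1, c_hat), 2))
```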

3.
The growth pattern of Loxechinus albus in southern Chile was studied using size-at-age data obtained by reading growth bands on the genital plates. The scatter plots of size-at-age for samples collected in three different locations indicated that growth is linear between ages 2 and 10. Five different growth models, including linear, asymptotic and non-asymptotic functions, were fitted to the data, and model selection was conducted based on the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The AIC identified the Tanaka model as the most suitable for two of the three sites. However, the BIC led to the selection of the linear model for all zones. Our results show that the growth pattern of L. albus differs from the predominantly asymptotic pattern that has been reported for other sea urchin species.
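The selection exercise can be reproduced in miniature by fitting competing growth curves and scoring them with Gaussian AIC and BIC; the data below are hypothetical, and the Tanaka model is omitted in favour of the familiar von Bertalanffy curve:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical size-at-age data (age in years, test diameter in mm).
age = np.array([2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
size = np.array([18, 27, 35, 44, 52, 61, 69, 78, 86], dtype=float)

def linear(t, a, b):
    return a + b * t

def von_bertalanffy(t, linf, kgr, t0):
    return linf * (1.0 - np.exp(-kgr * (t - t0)))

def ic(y, yhat, n_par):
    """Gaussian AIC and BIC computed from the residual sum of squares."""
    n = len(y)
    rss = np.sum((y - yhat) ** 2)
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1.0)
    k = n_par + 1                       # + 1 for the error variance
    return -2 * loglik + 2 * k, -2 * loglik + k * np.log(n)

for name, f, p0 in [("linear", linear, (0, 9)),
                    ("von Bertalanffy", von_bertalanffy, (150, 0.1, 0))]:
    popt, _ = curve_fit(f, age, size, p0=p0, maxfev=10000)
    aic, bic = ic(size, f(age, *popt), len(popt))
    print(f"{name}: AIC={aic:.1f}  BIC={bic:.1f}")
```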

4.
Wildlife sampling for habitat selection often combines a random background sample with a random sample of used sites, because the background sample could contain too few used sites to be informative for rare species. This approach is referred to as use-availability sampling. Two variants are considered where there is: (1) a random background sample including used and unused sites augmented with a sample of used sites, and (2) a sample of used sites augmented with a contaminated background sample, i.e. use is not recorded. A weighted estimator first proposed by Manski and Lerman (Econometrica 45(8):1977–1988, 1977) forms the basis for our suggested approach. The weighted estimator has been shown to perform better than the usual unweighted approach with uncontaminated data and mis-specified logit models (Xie and Manski in Sociol Methods Res 17(3):283–302, 1989). A weighted EM algorithm is developed for use with contaminated background data. We show that the weighted estimator continues to perform well with contaminated data and maintains its robustness to model mis-specification. The weighted estimator has not previously been used for use-availability sampling due to reliance on the assumption that only the intercept is biased, which is valid for a correct logit model. We show that adjusting the intercept may not eliminate the bias with an incorrect logit model. In this case, the weighted estimator is a relatively simple and effective alternative.
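The Manski-Lerman estimator is an ordinary logit fit in which each observation is weighted by the ratio of its population class share to its sample class share. A sketch on synthetic use-availability data (the assumed prevalence q1 is illustrative, and the paper's weighted EM step for contaminated backgrounds is not shown):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical use-availability data: a sample of used sites (y = 1)
# augmented with a random background sample treated as available (y = 0).
n_used, n_bg = 200, 1000
x = np.concatenate([rng.normal(1.0, 1.0, n_used), rng.normal(0.0, 1.0, n_bg)])
y = np.concatenate([np.ones(n_used), np.zeros(n_bg)])

# Manski-Lerman weights: assumed population prevalence of use divided by
# the sample share of each class (q1 is an illustrative assumption).
q1 = 0.10
h1 = n_used / (n_used + n_bg)
w = np.where(y == 1, q1 / h1, (1 - q1) / (1 - h1))

def neg_weighted_loglik(beta, X, y, w):
    """Weighted logistic log-likelihood (WESML), negated for minimization."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    eps = 1e-12
    return -np.sum(w * (y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

X = np.column_stack([np.ones_like(x), x])
res = minimize(neg_weighted_loglik, np.zeros(2), args=(X, y, w), method="BFGS")
print("weighted logit estimates:", res.x)
```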

5.
Selecting a binary Markov model for a precipitation process
This paper uses rth-order categorical Markov chains to model the probability of precipitation. Several stationary and non-stationary high-order Markov models are proposed and compared using BIC. Because the number of parameters grows exponentially with the Markov order, several classes of high-order Markov models whose parameter counts grow only modestly are proposed. For example, models that use the number of precipitation days in a period prior to the date, the temperature of the previous day, and sine/cosine periodic functions (to model seasonality) are considered. The theory of partial likelihood is used to estimate the parameters. Parsimonious non-stationary first-order Markov models with few seasonal terms are found optimal using BIC, and temperature does not turn out to be a useful covariate. However, BIC seems to underestimate the number of seasonal terms. We also compare the results with AIC in some cases, which tends to pick parsimonious models with more seasonal terms and higher order. We further show that ignoring seasonal terms results in selecting higher-order Markov chains. Finally, we apply the methods to build confidence intervals for the probability of periods with no precipitation or a low number of precipitation days in Calgary, using historical data from 1980 to 2000.
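Because the partial likelihood of a binary Markov chain factorizes into Bernoulli terms, a first-order model with seasonal harmonics can be fitted as a logistic regression of today's occurrence on yesterday's; a sketch on synthetic data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Hypothetical daily precipitation indicator series (1 = wet day).
n = 2000
day = np.arange(n)
p_base = 0.3 + 0.15 * np.sin(2 * np.pi * day / 365.25)   # seasonal cycle
wet = (rng.random(n) < p_base).astype(float)

# First-order Markov model via partial likelihood: regress today's
# indicator on yesterday's indicator plus one pair of seasonal harmonics.
y = wet[1:]
X = np.column_stack([
    np.ones(n - 1),
    wet[:-1],                                  # previous-day occurrence
    np.sin(2 * np.pi * day[1:] / 365.25),      # seasonal terms
    np.cos(2 * np.pi * day[1:] / 365.25),
])
fit = sm.Logit(y, X).fit(disp=False)
print(fit.params)
print("BIC:", fit.bic)   # compare against higher orders / more harmonics
```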

6.
Atmospheric carbon dioxide concentration (ACDC) level is an important factor for predicting temperature and climate changes. We analyze the conditional variance of a function of the ACDC level known as the ACDC level growth rate (ACDCGR) using generalised autoregressive conditional heteroskedasticity (GARCH) models and GARCH models with leverage effects. The data are a subset of the well-known Mauna Loa atmospheric carbon dioxide record. We test for the presence of stylized facts in the ACDCGR time series. The performance of the standard GARCH model is compared with that of the EGARCH, TGARCH and PGARCH models. The model-fit measures AIC, BIC and likelihood are calculated for each fitted model. The results confirm the presence of some of the important stylized facts in the ACDCGR time series, but the leverage effect is not significant. The out-of-sample one-step-ahead forecasting performance of the models is evaluated using the RMSE and MAE metrics. The EGARCH model with Student's $t$ disturbances showed the best fit and valid forecasting performance. A bootstrap algorithm is employed to calculate confidence intervals for future values of the ACDCGR time series and its volatility. The constructed bootstrap confidence intervals showed reasonable performance.
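Assuming the third-party arch package, the favoured specification can be fitted in a few lines (the series below is synthetic, standing in for the ACDCGR data):

```python
import numpy as np
from arch import arch_model  # third-party package: pip install arch

rng = np.random.default_rng(2)
# Hypothetical growth-rate series standing in for the Mauna Loa ACDCGR data.
growth = rng.standard_t(df=6, size=500) * 0.1

# EGARCH(1,1) with Student-t innovations, as favoured in the abstract.
am = arch_model(100 * growth, mean="Constant", vol="EGARCH",
                p=1, o=1, q=1, dist="t")
res = am.fit(disp="off")
print(res.aic, res.bic)                 # model-fit measures used in the paper
fc = res.forecast(horizon=1)
print("one-step-ahead variance:", fc.variance.iloc[-1, 0])
```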

7.
Thompson (1990) introduced the adaptive cluster sampling design and developed two unbiased estimators, the modified Horvitz-Thompson (HT) and Hansen-Hurwitz (HH) estimators, for this sampling design, and noticed that these estimators are not functions of the minimal sufficient statistic. He applied the Rao-Blackwell theorem to improve them. Despite having smaller variances, these latter estimators have not received attention because a suitable method or algorithm for computing them was not available. In this paper we obtain closed forms of the Rao-Blackwell versions which can easily be computed. We also show that the variance reduction achieved by the Rao-Blackwell versions is greater for the HH estimator than for the HT estimator. When the condition for taking extra samples is y > 0, one can expect some Rao-Blackwell improvement in the HH estimator but not in the HT estimator. Two examples are given.

8.
Practical considerations often motivate employing variable probability sampling designs when estimating characteristics of forest populations. Three distribution function estimators, the Horvitz-Thompson estimator, a difference estimator, and a ratio estimator, are compared following variable probability sampling in which the inclusion probabilities are proportional to an auxiliary variable, X. Relative performance of the estimators is affected by several factors, including the distribution of the inclusion probabilities, the correlation (ρ) between X and the response Y, and the position along the distribution function being estimated. Both the ratio and difference estimators are superior to the Horvitz-Thompson estimator. The difference estimator gains better precision than the ratio estimator toward the upper portion of the distribution function, but the ratio estimator is superior toward the lower end of the distribution function. The point along the distribution function at which the difference estimator becomes more precise than the ratio estimator depends on the sampling design, as well as the coefficient of variation of X and ρ. A simple confidence interval procedure provides close to nominal coverage for intervals constructed from both the difference and ratio estimators, with the exception that coverage may be poor for the lower tail of the distribution function when using the ratio estimator.
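Two of the estimators have simple forms: the Horvitz-Thompson CDF estimator divides by the known N, while the ratio version divides by the HT estimate of N. A sketch under Poisson sampling with inclusion probabilities proportional to X (the difference estimator, which needs the known distribution of X, is omitted):

```python
import numpy as np

def ht_cdf(y_s, pi_s, N, t):
    """Horvitz-Thompson estimator of the population CDF at t."""
    return np.sum((y_s <= t) / pi_s) / N

def ratio_cdf(y_s, pi_s, t):
    """Ratio estimator: HT numerator over the HT estimate of N."""
    return np.sum((y_s <= t) / pi_s) / np.sum(1.0 / pi_s)

# Hypothetical forest population: Y correlated with auxiliary X, inclusion
# probabilities proportional to X (Poisson sampling for simplicity).
rng = np.random.default_rng(3)
N, n = 5000, 250
x = rng.gamma(2.0, 10.0, N)
y = 2.0 * x + rng.normal(0, 8, N)
pi = n * x / x.sum()
sample = rng.random(N) < pi
t = np.quantile(y, 0.5)
print("true F(t):", np.mean(y <= t))
print("HT:", ht_cdf(y[sample], pi[sample], N, t))
print("ratio:", ratio_cdf(y[sample], pi[sample], t))
```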

9.
Model averaging, specifically information theoretic approaches based on Akaike’s information criterion (IT-AIC approaches), has had a major influence on statistical practices in the field of ecology and evolution. However, a neglected issue is that, in common with most other model fitting approaches, IT-AIC methods are sensitive to the presence of missing observations. The commonest way of handling missing data is complete-case analysis (the deletion from the dataset of all cases containing any missing values). It is well known that this results in reduced estimation precision (or reduced statistical power) and biased parameter estimates; however, the implications for model selection have not been explored. Here we employ an example from behavioural ecology to illustrate how missing data can affect the conclusions drawn from model selection or hypothesis testing. We show how missing observations can be recovered to give accurate estimates for IT-related indices (e.g. AIC and Akaike weight) as well as parameters (and their standard errors) by utilizing ‘multiple imputation’. We use this paper to illustrate key concepts from missing data theory and as a basis for discussing available methods for handling missing data. The example is intended to serve as a practically oriented case study for behavioural ecologists deciding how to handle missing data in their own datasets, and also as a first attempt to consider the problems of conducting model selection and averaging in the presence of missing observations.
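A minimal version of the recovery step: impute the missing covariate stochastically from a regression model, refit, and pool the IT indices across imputations (synthetic data and a single imputation model; with several candidate models one would pool Akaike weights the same way):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=0.8, size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)
x2[rng.random(n) < 0.3] = np.nan          # 30% of x2 missing at random

def impute_once(x1, x2, rng):
    """Draw one stochastic imputation of x2 from its regression on x1."""
    obs = ~np.isnan(x2)
    fit = sm.OLS(x2[obs], sm.add_constant(x1[obs])).fit()
    pred = fit.predict(sm.add_constant(x1))
    sigma = np.sqrt(fit.scale)
    out = x2.copy()
    out[~obs] = pred[~obs] + rng.normal(scale=sigma, size=(~obs).sum())
    return out

m = 20
aics = []
for _ in range(m):
    x2i = impute_once(x1, x2, rng)
    X = sm.add_constant(np.column_stack([x1, x2i]))
    aics.append(sm.OLS(y, X).fit().aic)
print("mean AIC over", m, "imputations:", np.mean(aics))
```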

10.
An estimating function approach to the inference of catch-effort models
A class of catch-effort models, which allows for heterogeneous removal probabilities, is proposed for closed populations. The model includes three types of removal probabilities: multiplicative, Poisson and logistic. The usual removal and generalized removal models then become special cases. The equivalence of the proposed model and a special type of capture-recapture model is discussed. A unified estimating function approach is used to estimate the initial population size. For the homogeneous model, the resulting population size estimator based on optimal estimating functions is asymptotically equivalent to the maximum likelihood estimator. One advantage of our approach is that it can be extended to handle heterogeneous populations for which maximum likelihood estimators do not exist. The bootstrap method is applied to construct variance estimators and confidence intervals. We illustrate the method with two real data examples. Results of a simulation study investigating the performance of the proposed estimation procedure are presented.
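For orientation, the classical Leslie depletion fit shows the homogeneous special case in which CPUE declines linearly in cumulative catch; this is not the paper's estimating-function machinery, only the familiar baseline it generalizes:

```python
import numpy as np

# Classical Leslie depletion estimator for catch-effort data:
# CPUE_t = q * (N0 - K_t), where K_t is the catch removed before period t.
catch = np.array([120, 95, 80, 64, 52], dtype=float)   # hypothetical removals
effort = np.array([10, 10, 10, 10, 10], dtype=float)   # constant effort
cpue = catch / effort
K = np.concatenate([[0.0], np.cumsum(catch)[:-1]])

slope, intercept = np.polyfit(K, cpue, 1)
q = -slope                                 # catchability coefficient
print("q estimate:", q, " initial population N0:", intercept / q)
```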

11.
A fundamental challenge to estimating population size with mark-recapture methods is heterogeneous capture probabilities and subsequent bias of population estimates. Confronting this problem usually requires substantial sampling effort that can be difficult to achieve for some species, such as carnivores. We developed a methodology that uses two data sources to deal with heterogeneity and applied this to DNA mark-recapture data from grizzly bears (Ursus arctos). We improved population estimates by incorporating additional DNA "captures" of grizzly bears obtained by collecting hair from unbaited bear rub trees concurrently with baited, grid-based, hair snag sampling. We consider a Lincoln-Petersen estimator with hair snag captures as the initial session and rub tree captures as the recapture session and develop an estimator in program MARK that treats hair snag and rub tree samples as successive sessions. Using empirical data from a large-scale project in the greater Glacier National Park, Montana, USA, area and simulation modeling we evaluate these methods and compare the results to hair-snag-only estimates. Empirical results indicate that, compared with hair-snag-only data, the joint hair-snag-rub-tree methods produce similar but more precise estimates if capture and recapture rates are reasonably high for both methods. Simulation results suggest that estimators are potentially affected by correlation of capture probabilities between sample types in the presence of heterogeneity. Overall, closed population Huggins-Pledger estimators showed the highest precision and were most robust to sparse data, heterogeneity, and capture probability correlation among sampling types. Results also indicate that these estimators can be used when a segment of the population has zero capture probability for one of the methods. We propose that this general methodology may be useful for other species in which mark-recapture data are available from multiple sources.
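Treating hair-snag captures as session one and rub-tree captures as session two, the simplest version of the estimator is Chapman's bias-corrected Lincoln-Petersen; a sketch with hypothetical detection counts:

```python
def chapman_estimate(n1, n2, m2):
    """Chapman's bias-corrected Lincoln-Petersen estimator of population size.

    n1 : individuals detected in the first session (e.g. hair snags)
    n2 : individuals detected in the second session (e.g. rub trees)
    m2 : individuals detected in both sessions
    """
    n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
           / ((m2 + 1) ** 2 * (m2 + 2)))
    return n_hat, var ** 0.5

# Hypothetical grizzly bear detections from the two DNA sources.
n_hat, se = chapman_estimate(n1=45, n2=38, m2=12)
print(f"N_hat = {n_hat:.0f} (SE ~ {se:.0f})")
```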

12.
A presence–absence map consists of indicators of the occurrence or nonoccurrence of a given species in each cell of a grid; the number of individuals in a cell is not counted once the cell is known to be occupied. Such maps are commonly used to estimate the distribution of a species, but our interest is in using these data to estimate the abundance of the species. In practice, certain types of species (in particular flora types) may be spatially clustered. For example, some plant communities will naturally group together according to similar environmental characteristics within a given area. To estimate abundance, we develop an approach based on clustered negative binomial models with unknown cluster sizes. Our approach uses working clusters of cells to construct an estimator which we show is consistent. We also introduce a new concept called super-clustering, used to estimate components of the standard errors and interval estimators. A simulation study is conducted to examine the performance of the estimators, and they are applied to real data.
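The nucleus of the approach is that a negative binomial model links occupancy to abundance through its zero probability, which can be inverted; a sketch with an assumed clustering parameter k (the paper's working-cluster estimator and super-clustering variance step are not shown):

```python
import numpy as np

def nb_mean_from_occupancy(p_occ, k):
    """Invert the negative binomial zero probability to get the mean count:
    P(Y = 0) = (1 + mu/k)^(-k)  =>  mu = k * ((1 - p_occ)^(-1/k) - 1)."""
    return k * ((1.0 - p_occ) ** (-1.0 / k) - 1.0)

# Hypothetical presence-absence map: 4000 grid cells, 35% occupied,
# with an assumed clustering parameter k (smaller k = more clustered).
n_cells, p_occ = 4000, 0.35
for k in (0.5, 1.0, 5.0):
    mu = nb_mean_from_occupancy(p_occ, k)
    print(f"k={k}: mean per cell ~ {mu:.3f}, abundance ~ {mu * n_cells:.0f}")
```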

13.
Modeling empirical distributions of repeated counts with parametric probability distributions is a frequent problem when studying species abundance. One must choose a family of distributions which is flexible enough to take into account very diverse patterns and possesses parameters with clear biological/ecological interpretations. The negative binomial distribution fulfills these criteria and was selected for modeling counts of marine fish and invertebrates. This distribution depends on a vector \(\left( K,\mathfrak {P}\right) \) of parameters, and ranges from the Poisson distribution (when \(K\rightarrow +\infty \)) to Fisher’s log-series, when \(K\rightarrow 0\). Moreover, these parameters have biological/ecological interpretations which are detailed in the literature and in this study. We compared three estimators of K, \(\mathfrak {P}\) and the parameter \(\alpha \) of Fisher’s log-series, following the work of Rao CR (Statistical ecology. Pennsylvania State University Press, University Park, 1971) on a three-parameter unstandardized variant of the negative binomial distribution. We further investigated the coherence underlying parameter values resulting from the different estimators, using both real count data collected in the Mauritanian Exclusive Economic Zone (MEEZ) during the period 1987–2010 and realistic simulations of these data. In the case of the MEEZ, we first built homogeneous lists of counts (replicates), by gathering observations of each species with respect to “typical environments” obtained by clustering the sampled stations. The best estimation of \(\left( K,\mathfrak {P}\right) \) was generally obtained by penalized minimum Hellinger distance estimation. Interestingly, the parameters of most of the correctly sampled species seem compatible with the classical birth-and-death model of population growth with immigration by Kendall (Biometrika 35:6–15, 1948).
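Minimum Hellinger distance estimation of (K, P) can be sketched by matching the square roots of the empirical and model pmfs (synthetic counts; the penalized variant used in the paper is omitted):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(5)
counts = nbinom.rvs(0.8, 0.1, size=300, random_state=rng)  # synthetic replicates

xmax = counts.max() + 20
support = np.arange(xmax + 1)
emp = np.bincount(counts, minlength=xmax + 1) / counts.size

def hellinger(theta):
    """Squared Hellinger distance between empirical and NB(k, P) pmfs,
    with parameters transformed to stay in their valid ranges."""
    k, p = np.exp(theta[0]), 1.0 / (1.0 + np.exp(-theta[1]))
    model = nbinom.pmf(support, k, p)
    return 1.0 - np.sum(np.sqrt(emp * model))

res = minimize(hellinger, x0=[0.0, 0.0], method="Nelder-Mead")
k_hat = np.exp(res.x[0])
p_hat = 1.0 / (1.0 + np.exp(-res.x[1]))
print(f"k_hat={k_hat:.2f}, P_hat={p_hat:.2f}")
```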

14.

Background

Semi-natural plant communities such as field boundaries play an important ecological role in agricultural landscapes, e.g., provision of refuge for plants and other species, food-web support, or habitat connectivity. To prevent undesired effects of herbicide applications on these communities and their structure, the registration and application of herbicides are regulated by risk assessment schemes in many industrialized countries. Standardized individual-level greenhouse experiments are conducted on a selection of crop and wild plant species to characterize the effects that herbicide loads potentially reaching off-field areas have on non-target plants. Uncertainties regarding the protectiveness of such approaches to risk assessment might be addressed by assessment factors, which are often under discussion. As an alternative approach, plant community models can be used to predict potential effects on plant communities of interest based on extrapolation of the individual-level effects measured in the standardized greenhouse experiments. In this study, we analyzed the reliability and adequacy of the plant community model IBC-grass (individual-based plant community model for grasslands) by comparing model predictions with empirically measured effects at the plant community level.

Results

We showed that the effects predicted by the model IBC-grass were in accordance with the empirical data. Based on the species-specific dose responses (calculated from empirical effects in monocultures measured 4 weeks after application), the model was able to realistically predict short-term herbicide impacts on communities when compared to empirical data.

Conclusion

The results presented in this study demonstrate how the current standard greenhouse experiments, which measure herbicide impacts at the individual level, can be coupled with the model IBC-grass to estimate effects at the plant community level. In this way, the model can be used as a tool in ecological risk assessment.
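The model input named above, species-specific dose responses from monoculture experiments, is conventionally obtained from a log-logistic fit; a hedged sketch with hypothetical biomass data:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_logistic(dose, lower, upper, ed50, slope):
    """Four-parameter log-logistic dose-response curve."""
    return lower + (upper - lower) / (1.0 + (dose / ed50) ** slope)

# Hypothetical monoculture biomass (% of control) at increasing herbicide doses.
dose = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
biomass = np.array([98.0, 95.0, 80.0, 45.0, 15.0, 5.0])

popt, _ = curve_fit(log_logistic, dose, biomass, p0=(0, 100, 2.0, 1.0))
print("ED50 estimate:", popt[2])
```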

15.
Coverage, i.e., the area covered by the target attribute in the study region, is a key parameter in many surveys. Coverage estimation is usually performed by adopting a replicated protocol based on line-intercept sampling coupled with a suitable linear homogeneous estimator. Since coverage is a parameter which may be interestingly represented as the integral of a suitable function, improved Monte Carlo strategies for implementing the replicated protocol are introduced in order to achieve estimators with small variance rates. In addition, new specific theoretical results on Monte Carlo integration methods are given to deal with the integrand functions arising in the special coverage estimation setting.
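The idea of coverage as an integral can be made concrete with a toy replicated protocol: random vertical transects across the unit square, each transect's covered length itself estimated by Monte Carlo (the paper's improved integration strategies are not shown):

```python
import numpy as np

rng = np.random.default_rng(7)

# Coverage as an integral: fraction of the unit square covered by a target
# region (here a disc of radius 0.3, true coverage = pi * 0.09 ~ 0.283).
def in_target(x, y):
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.3 ** 2

m = 200                          # replicated vertical transects
xs = rng.random(m)
ys = rng.random((m, 500))        # Monte Carlo points along each transect
line_cover = in_target(xs[:, None], ys).mean(axis=1)   # covered length per line
print("coverage estimate:", line_cover.mean(),
      "+/-", line_cover.std(ddof=1) / np.sqrt(m))
```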

16.
Heteroscedastic additive and multiplicative models are proposed to disaggregate household data on water consumption from Athens and provide individual consumption estimates. The models adjust for heteroscedasticity by assuming that the variances are related to covariates. Household characteristics that can influence consumption are also included in the models in order to allow for a clearer measurement of the effects of individual characteristics. Estimation is accomplished through a penalized least squares approach. The method is applied to a sample of real data on domestic water consumption in Athens. The results show greater water consumption for males, while single-female households are those that use the lowest quantities of water. Consumption curves by age and gender are constructed, revealing differences between the two sexes.
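As a generic illustration of letting variances depend on covariates (not the paper's penalized least squares disaggregation), a two-stage weighted least squares fit on synthetic household data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 400
hh_size = rng.integers(1, 6, n).astype(float)   # household size
male = rng.integers(0, 2, n).astype(float)      # indicator covariate
sigma = np.exp(0.2 + 0.3 * hh_size / 5)         # variance depends on covariate
cons = 50 + 40 * hh_size + 10 * male + rng.normal(0, sigma * 10, n)

X = sm.add_constant(np.column_stack([hh_size, male]))
mean_fit = sm.OLS(cons, X).fit()
# Model the variance as a function of covariates via log squared residuals.
log_r2 = np.log(mean_fit.resid ** 2 + 1e-8)
var_fit = sm.OLS(log_r2, X).fit()
weights = 1.0 / np.exp(var_fit.fittedvalues)
wls_fit = sm.WLS(cons, X, weights=weights).fit()
print("heteroscedasticity-adjusted estimates:", wls_fit.params)
```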

17.
In geostatistics, both kriging and smoothing splines are commonly used to generate an interpolated map of a quantity of interest. The geoadditive model proposed by Kammann and Wand (J R Stat Soc: Ser C (Appl Stat) 52(1):1–18, 2003) represents a fusion of kriging and penalized spline additive models. Complex data issues, including non-linear covariate trends, multiple measurements at a location and clustered observations, are easily handled using the geoadditive model. We propose a likelihood-based estimation procedure that enables the estimation of the range (spatial decay) parameter associated with the penalized splines of the spatial component in the geoadditive model. We show how the spatial covariance structure (covariogram) can be derived from the geoadditive model. In a simulation study, we show that the underlying spatial process and prediction of the spatial map are estimated well using the proposed likelihood-based estimation procedure. We present several applications of the proposed methods on real-life data examples.
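The spatial component of the geoadditive model is a low-rank kriging basis built from a covariance function evaluated at knots; a sketch of that construction as we understand it, with the range parameter fixed (the paper estimates it by likelihood):

```python
import numpy as np

rng = np.random.default_rng(8)

# Low-rank kriging basis in the spirit of Kammann and Wand's geoadditive
# model: radial covariance functions evaluated at a set of spatial knots.
def expo_cov(d, range_par):
    return np.exp(-d / range_par)

locs = rng.random((300, 2))          # observation coordinates
knots = rng.random((30, 2))          # spatial knots
range_par = 0.25                     # spatial decay, fixed here for illustration

d_lk = np.linalg.norm(locs[:, None, :] - knots[None, :, :], axis=2)
d_kk = np.linalg.norm(knots[:, None, :] - knots[None, :, :], axis=2)

omega = expo_cov(d_kk, range_par)
evals, evecs = np.linalg.eigh(omega)                 # omega is positive definite
omega_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
Z = expo_cov(d_lk, range_par) @ omega_inv_sqrt       # random-effects design matrix
print(Z.shape)   # feed into a mixed model: y = X*beta + Z*u + e
```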

18.
Space-time data are ubiquitous in the environmental sciences. Often, as is the case with atmospheric and oceanographic processes, these data contain many different scales of spatial and temporal variability. Such data are often non-stationary in space and time and may involve many observation/prediction locations. These factors can limit the effectiveness of traditional space-time statistical models and methods. In this article, we propose the use of hierarchical space-time models to achieve more flexible models and methods for the analysis of environmental data distributed in space and time. The first stage of the hierarchical model specifies a measurement-error process for the observational data in terms of some 'state' process. The second stage allows for site-specific time series models for this state variable. This stage includes large-scale (e.g. seasonal) variability plus a space-time dynamic process for the 'anomalies'. Much of our interest is with this anomaly process. In the third stage, the parameters of these time series models, which are distributed in space, are themselves given a joint distribution with spatial dependence (Markov random fields). The Bayesian formulation is completed in the last two stages by specifying priors on parameters. We implement the model in a Markov chain Monte Carlo framework and apply it to an atmospheric data set of monthly maximum temperature.

19.
Missing covariate values in linear regression models can be an important problem facing environmental researchers. Existing missing value treatment methods such as Multiple Imputation (MI), the EM algorithm and Data Augmentation (DA) assume that both observed and unobserved data come from the same distribution, most commonly a multivariate normal or a conditionally multivariate normal family. These methods do not try to model the missing data mechanism and rely on the assumption of Missing At Random (MAR). We present a DA method which does not rely on the MAR assumption and can model missing data mechanisms and covariate structure. This method utilizes the Gibbs Sampler as a tool for incorporating these structures and mechanisms. We apply this method to an ecological data set that relates fish condition to environmental variables. Notably, the presented DA method detects relationships that are not detected when other missing data methods are employed.

20.
Benchmark calculations are often made from data extracted from publications. Such data may not be in a form most appropriate for benchmark analysis, and, as a result, suboptimal and/or non-standard benchmark analyses are often applied. This problem can be mitigated in some cases using Monte Carlo computational methods that allow the likelihood of the published data to be calculated while still using an appropriate benchmark dose (BMD) definition. Such an approach is illustrated here using data from a study of workers exposed to styrene, in which a hybrid BMD calculation is implemented from dose-response data reported only as means and standard deviations of ratios of scores on neuropsychological tests from exposed subjects to corresponding scores from matched controls. The likelihood of the data is computed using a combination of analytic and Monte Carlo integration methods.
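The computational idea, integrating a latent quantity out of the likelihood by simulation, can be shown in a toy form (the actual BMD likelihood for the styrene data is more elaborate, and all numbers below are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def mc_loglik(theta, tau, ybar, se, m=5000):
    """Monte Carlo integration of a likelihood with a latent effect u:

        L(theta) = E_u[ Normal(ybar | theta + u, se^2) ],  u ~ Normal(0, tau^2)

    A stand-in for the analytic-plus-Monte-Carlo integration used in the
    abstract, applied to a published summary statistic (mean, SE).
    """
    u = rng.normal(0.0, tau, size=m)
    dens = stats.norm.pdf(ybar, loc=theta + u, scale=se)
    return np.log(dens.mean())

# Published summary: mean test-score ratio 0.92 with standard error 0.03.
grid = np.linspace(0.8, 1.0, 41)
ll = [mc_loglik(th, tau=0.02, ybar=0.92, se=0.03) for th in grid]
print("MLE on grid:", grid[int(np.argmax(ll))])
```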
