Traditional occupancy–abundance and abundance–variance–occupancy models do not take into account zero-inflation, which occurs when sampling rare species or in correlated counts arising from repeated measures. In this paper we propose a novel approach extending occupancy–abundance relationships to zero-inflated count data. This approach involves three steps: (1) selecting distributional assumptions and parsimonious models for the count data, (2) estimating abundance, occupancy and variance parameters as functions of site- and/or time-specific covariates, and (3) modelling the occupancy–abundance relationship using the parameters estimated in step 2. Five count datasets were used for comparing standard Poisson and negative binomial distribution (NBD) occupancy–abundance models. Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) occupancy–abundance models were introduced for the first time, and these were compared with the Poisson, NBD, He and Gaston's and Wilson and Room's abundance–variance–occupancy models. The percentage of zero counts ranged from 45 to 80% in the datasets analysed. For most of the datasets, the ZINB occupancy–abundance model performed better than the traditional Poisson, NBD and Wilson and Room's models. He and Gaston's model performed better than the ZINB in two out of the five datasets. However, the occupancy predicted by all models increased faster than the observed occupancy as density increased, resulting in a significant mismatch at the highest densities. Limitations of the various models are discussed, and the need for careful choice of count distributions and predictors in estimating abundance and occupancy parameters is indicated.
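As a rough illustration of step (3), occupancy can be read off any count distribution as the probability of a non-zero count, 1 − P(0). The sketch below is not the authors' code. It assumes the textbook parameterizations: mean mu and aggregation parameter k for the NBD, and a zero-inflation probability pi mixed with a count component whose mean is mu. The function names are hypothetical.

```python
import math


def occupancy_poisson(mu):
    """Occupancy under a Poisson count model: 1 - P(count = 0)."""
    return 1.0 - math.exp(-mu)


def occupancy_nbd(mu, k):
    """Occupancy under a negative binomial with mean mu and aggregation k."""
    return 1.0 - (1.0 + mu / k) ** (-k)


def occupancy_zip(mu, pi):
    """Zero-inflated Poisson: a site is a structural zero with probability pi.

    Assumption: mu is the mean of the Poisson component, so the overall
    mean count is (1 - pi) * mu.
    """
    return (1.0 - pi) * occupancy_poisson(mu)


def occupancy_zinb(mu, k, pi):
    """Zero-inflated negative binomial, same convention as occupancy_zip."""
    return (1.0 - pi) * occupancy_nbd(mu, k)


if __name__ == "__main__":
    # Compare predicted occupancy across models at a few mean densities.
    for mu in (0.5, 2.0, 8.0):
        print(mu,
              round(occupancy_poisson(mu), 3),
              round(occupancy_nbd(mu, k=0.5), 3),
              round(occupancy_zip(mu, pi=0.4), 3),
              round(occupancy_zinb(mu, k=0.5, pi=0.4), 3))
```

Under this convention the zero-inflated curves level off at 1 − pi rather than at 1, which is one way such models can stay below the Poisson and NBD predictions at high density.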

The arcsine is asinine: the analysis of proportions in ecology (total citations: 3; self-citations: 0; citations by others: 3)
Warton DI, Hui FK. Ecology, 2011, 92(1): 3-10
The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
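The point about nonsensical predictions can be illustrated with a minimal sketch of the two transformations and their inverses. This is an illustration only, not the authors' analysis. The function names are hypothetical, and proportions of exactly 0 or 1 (where the logit is undefined) are deliberately not handled here.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def inv_arcsine_sqrt(y):
    """Back-transform from the arcsine scale to a proportion."""
    return math.sin(y) ** 2


def logit(p):
    """Logit transform of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Back-transform from the logit scale; always lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # The arcsine scale is bounded by [0, pi/2]. A linear model can predict
    # values outside that range, and the back-transform is then no longer
    # monotone: a larger fitted value gives a smaller proportion.
    for y in (1.0, math.pi / 2, 2.0):
        print("arcsine scale", round(y, 3), "->", round(inv_arcsine_sqrt(y), 3))
    # The logit scale is unbounded, so any fitted value maps back to (0, 1).
    for x in (-10.0, 0.0, 10.0):
        print("logit scale", x, "->", round(inv_logit(x), 5))
```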

Ecological studies investigate relationships at the level of the group, rather than at the level of the individual. Although such studies are a common design in epidemiology, it is well-known that estimates may be subject to ecological bias. Most discussion of ecological bias has focused on rare disease events, where the tractability of the loglinear model allows some characterization of the nature of different biases. This paper concentrates on non-rare events, where the Poisson approximation to the binomial distribution is not appropriate. We limit the discussion to bias that arises from within-area variability in exposures and confounders. Our aims are to investigate the likely sizes and directions of bias and, where possible, to suggest methods for controlling the bias or for addressing the sensitivity of inference to assumptions on the nature of the bias. We illustrate that for non-rare events it is much more difficult to characterize the direction of bias than in the rare case. A series of simple numerical examples based on a chronic study of respiratory health illustrate the ideas of the paper.

Iwao's quadratic regression and Taylor's Power Law (TPL) are commonly used to model the variance as a function of the mean for sample counts of insect populations that exhibit spatial aggregation. The modeled variance and distribution of the mean are typically used in pest management programs to decide if the population is above the action threshold in any management unit (MU) (e.g., orchard, forest compartment). For nested or multi-level sampling, the usual two-stage modeling procedure first obtains the sample variance for each MU and sampling level using ANOVA and then fits a regression of variance on the mean for each level using either the Iwao or the TPL variance model. Here this approach is compared to the single-stage procedure of fitting a generalized linear mixed model (GLMM) directly to the count data, with both approaches demonstrated using 2-level sampling. GLMMs and additive GLMMs (AGLMMs) with conditional Poisson variance function as well as the extension to the negative binomial are described. Generalization to more than two sampling levels is outlined. Formulae for calculating optimal relative sample sizes (ORSS) and the operating characteristic curve for the control decision are given for each model. The ORSS are independent of the mean in the case of the AGLMMs. The application described is estimation of the variance of the mean number of leaves per shoot occupied by immature stages of a defoliator of eucalypts, the Tasmanian Eucalyptus leaf beetle, based on a sample of trees within plots from each forest compartment. Historical population monitoring data were fitted using the above approaches.
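The second stage of the two-stage procedure can be sketched as follows, assuming the usual forms of the two variance models: TPL as variance = a · mean^b, fitted by least squares on the log-log scale, and Iwao's model as variance = (alpha + 1) · mean + (beta − 1) · mean². The function names and the example numbers are hypothetical, and the sketch does not reproduce the GLMM approach that the abstract compares against.

```python
import numpy as np


def fit_taylor_power_law(means, variances):
    """Fit variance = a * mean**b by linear regression of log(variance)
    on log(mean). Returns (a, b)."""
    slope, intercept = np.polyfit(np.log(means), np.log(variances), 1)
    return float(np.exp(intercept)), float(slope)


def iwao_variance(mean, alpha, beta):
    """Variance implied by Iwao's quadratic model (assumed standard form)."""
    return (alpha + 1.0) * mean + (beta - 1.0) * mean ** 2


if __name__ == "__main__":
    # Made-up per-unit sample means and variances, for illustration only.
    means = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
    variances = np.array([0.7, 1.9, 5.8, 15.5, 46.0])
    a, b = fit_taylor_power_law(means, variances)
    print("TPL a =", round(a, 3), "b =", round(b, 3))
    print("Iwao variance at mean 4:", iwao_variance(4.0, alpha=0.2, beta=1.5))
```

With alpha = 0 and beta = 1 the Iwao form reduces to variance = mean, the Poisson case, which gives a quick sanity check on fitted coefficients.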

Efficient statistical mapping of avian count data (total citations: 3; self-citations: 0; citations by others: 3)
We develop a spatial modeling framework for count data that is efficient to implement in high-dimensional prediction problems. We consider spectral parameterizations for the spatially varying mean of a Poisson model. The spectral parameterization of the spatial process is very computationally efficient, enabling effective estimation and prediction in large problems using Markov chain Monte Carlo techniques. We apply this model to creating avian relative abundance maps from North American Breeding Bird Survey (BBS) data. Variation in the ability of observers to count birds is modeled as spatially independent noise, resulting in over-dispersion relative to the Poisson assumption. This approach represents an improvement over existing approaches used for spatial modeling of BBS data which are either inefficient for continental scale modeling and prediction or fail to accommodate important distributional features of count data thus leading to inaccurate accounting of prediction uncertainty.

Bayesian methods incorporate prior knowledge into a statistical analysis. This prior knowledge is usually restricted to assumptions regarding the form of probability distributions of the parameters of interest, leaving their values to be determined mainly through the data. Here we show how a Bayesian approach can be applied to the problem of drawing inference regarding species abundance distributions and comparing diversity indices between sites. The classic log series and the lognormal models of relative-abundance distribution are apparently quite different in form. The first is a sampling distribution while the other is a model of abundance of the underlying population. Bayesian methods help unite these two models in a common framework. Markov chain Monte Carlo simulation can be used to fit both distributions as small hierarchical models with shared common assumptions. Sampling error can be assumed to follow a Poisson distribution. Species not found in a sample, but suspected to be present in the region or community of interest, can be given zero abundance. This not only simplifies the process of model fitting, but also provides a convenient way of calculating confidence intervals for diversity indices. The method is especially useful when a comparison of species diversity between sites with different sample sizes is the key motivation behind the research. We illustrate the potential of the approach using data on fruit-feeding butterflies in southern Mexico. We conclude that, once all assumptions have been made transparent, a single data set may provide support for the belief that diversity is negatively affected by anthropogenic forest disturbance. Bayesian methods help to apply theory regarding the distribution of abundance in ecological communities to applied conservation.

Ver Hoef JM, Boveng PL. Ecology, 2007, 88(11): 2766-2772
Quasi-Poisson and negative binomial regression models have equal numbers of parameters, and either could be used for overdispersed count data. While they often give similar results, there can be striking differences in estimating the effects of covariates. We explain when and why such differences occur. The variance of a quasi-Poisson model is a linear function of the mean while the variance of a negative binomial model is a quadratic function of the mean. These variance relationships affect the weights in the iteratively weighted least-squares algorithm of fitting models to data. Because the variance is a function of the mean, large and small counts get weighted differently in quasi-Poisson and negative binomial regression. We provide an example using harbor seal counts from aerial surveys. These counts are affected by date, time of day, and time relative to low tide. We present results on a data set that showed a dramatic difference in estimated abundance of harbor seals when using quasi-Poisson vs. negative binomial regression. This difference is described and explained in light of the different weighting used in each regression method. A general understanding of weighting can help ecologists choose between these two methods.
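The weighting argument can be made concrete with a small sketch. It assumes a log link, for which the iteratively weighted least-squares working weight is mu² / Var(mu), together with the variance forms theta · mu for the quasi-Poisson and mu + kappa · mu² for the negative binomial. The symbols theta and kappa and the function names are assumptions made for illustration, not notation taken from the abstract.

```python
def quasi_poisson_weight(mu, theta):
    """IWLS working weight under a log link with Var = theta * mu.

    weight = mu**2 / (theta * mu) = mu / theta, which grows without
    bound as the fitted mean increases.
    """
    return mu / theta


def negative_binomial_weight(mu, kappa):
    """IWLS working weight under a log link with Var = mu + kappa * mu**2.

    weight = mu**2 / (mu + kappa * mu**2) = mu / (1 + kappa * mu), which
    levels off at 1 / kappa for large fitted means.
    """
    return mu / (1.0 + kappa * mu)


if __name__ == "__main__":
    # Large counts dominate a quasi-Poisson fit far more than an NB fit.
    for mu in (1.0, 10.0, 100.0, 1000.0):
        print(mu,
              quasi_poisson_weight(mu, theta=5.0),
              round(negative_binomial_weight(mu, kappa=0.5), 4))
```

Under these assumptions, sites with large counts receive proportionally more weight in the quasi-Poisson fit, whereas the negative binomial fit weights large and small counts more evenly.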

In ecological studies, researchers often try to carry conclusions drawn from aggregate data over to the individual level. To do this correctly, the possibility of ecological bias should be studied and addressed. One of the key ideas used to address ecological bias is to derive the ecological model from the individual model and to check whether the parameter of interest in the individual model is identifiable in the ecological model. However, the procedure depends on unverifiable assumptions, and we recommend checking how sensitive the results are to these assumptions. We analyzed tuberculosis data collected in Seoul in 2005 using a spatial ecological regression model for aggregate count data with spatial correlation, and found that the deprivation index is likely to have a small positive effect on the individual-level risk of tuberculosis occurrence in Seoul. We examined this finding from several angles by performing in-depth sensitivity analyses. In particular, our findings are shown to be robust to the distributional assumptions for the individual exposure and the missing binary covariate across various scenarios.

In this article, the mathematical assumptions of a number of commonly used ecological regression models are made explicit, critically assessed, and related to ecological bias. In particular, the role and interpretation of random effects models are examined. The modeling of spatial variability is considered and related to an underlying continuous spatial field. The examination of such a field with respect to the modeling of risk in relation to a point source highlights an inconsistency in commonly used approaches. A theme of the paper is to examine how plausible individual-level models relate to those used in practice at the aggregate level. The individual-level models acknowledge confounding, within-area variability in exposures and confounders, measurement error and data anomalies and so we can examine how the area-level versions consider these aspects. We briefly discuss designs that efficiently sample individual data and would appear to be useful in environmental settings.

Approaches to assess the impacts of landscape disturbance scenarios on species range from metrics based on patterns of occurrence or habitat to comprehensive models that explicitly include ecological processes. The choice of metrics and models affects both how impacts are interpreted and the conservation decisions that follow. We explored the impacts of 3 realistic disturbance scenarios on 4 species with different ecological and taxonomic traits. We used progressively more complex models and metrics to evaluate relative impact and rank of scenarios on the species. Models ranged from species distribution models that relied on implicit assumptions about environmental factors and species presence to highly parameterized spatially explicit population models that explicitly included ecological processes and stochasticity. Metrics performed consistently in ranking different scenarios in order of severity primarily when variation in impact was driven by habitat amount. However, they differed in rank for cases where dispersal dynamics were critical in influencing metapopulation persistence. Impacts of scenarios on species with low dispersal ability were better characterized using models that explicitly captured these processes. Metapopulation capacity provided rank orders that most consistently correlated with those from highly parameterized and data-rich models and incorporated information about dispersal with little additional computational and data cost. Our results highlight the importance of explicitly considering species’ ecology, spatial configuration of habitat, and disturbance when choosing indicators of species persistence. We suggest using hybrid approaches that are a mixture of simple and complex models to improve multispecies assessments.

The statistical analysis of continuous data that is non-negative is a common task in quantitative ecology. An example, and our motivation, is the weight of a given fish species in a fish trawl. The analysis task is complicated by the occurrence of exactly zero observations. It makes many statistical methods for continuous data inappropriate. In this paper we propose a model that extends a Tweedie generalised linear model. The proposed model exploits the fact that a Tweedie distribution is equivalent to the distribution obtained by summing a Poisson number of gamma random variables. In the proposed model, both the number of gamma variates, and their average size, are modelled separately. The model has a composite link and has a flexible mean-variance relationship that can vary with covariates. We illustrate the model, and compare it to other models, using data from a fish trawl survey in south-east Australia.
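The Poisson-sum-of-gammas construction mentioned above can be sketched by simulation. The example below only illustrates how exact zeros and continuous positive values arise from the same mechanism. It does not implement the proposed regression model. The parameter names (lam for the Poisson mean, shape and scale for the gamma variates) and the function name are assumptions made for illustration.

```python
import numpy as np


def simulate_poisson_gamma(n, lam, shape, scale, seed=0):
    """Draw n totals, each the sum of a Poisson(lam) number of
    independent Gamma(shape, scale) variates.

    Returns (counts, totals). A total is exactly zero whenever the
    Poisson count is zero.
    """
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam, size=n)
    totals = np.zeros(n)
    positive = counts > 0
    # The sum of N independent Gamma(shape, scale) variates is
    # Gamma(N * shape, scale), so one draw per observation is enough.
    totals[positive] = rng.gamma(shape * counts[positive], scale)
    return counts, totals


if __name__ == "__main__":
    counts, totals = simulate_poisson_gamma(10_000, lam=1.0, shape=2.0, scale=3.0)
    print("proportion of exact zeros:", np.mean(totals == 0.0))
    print("mean total:", totals.mean())
```

In this construction the probability of an exact zero is exp(−lam) and the mean total is lam · shape · scale, so the number of variates and their average size can be varied separately, as the abstract describes.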

Fisher (1950) introduced the variance or dispersion index test statistic to test for deviations from the Poisson distribution. For this test, approximate critical values exist for large sample sizes. If the number of observations is small, this approximation can lead to a wrong conclusion. For small samples, the exact critical values can only be derived by enumeration of all possibilities. Tables of critical values for overdispersion already exist (e.g., Rao and Chakravarti, 1956). However, in many biological situations underdispersion, a more-regular-than-Poisson distribution, is a common phenomenon. Therefore, we have tabulated in this paper the one-tailed critical values for a small number of observations under the null hypothesis (H0) that the random variable is Poisson distributed against the alternative hypothesis of underdispersion. With the help of this table, the hypothesis that the observations in a data set are Poisson distributed can be tested easily with the variance test. The tables are illustrated with examples from the literature and some observations from our own research. In general, the χ² approximation gives a smaller significance level than the exact variance test.
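For reference, the dispersion index statistic itself is simple to compute. The sketch below assumes its usual form, D = (n − 1) · s² / mean, which under the Poisson null hypothesis is approximately chi-square distributed with n − 1 degrees of freedom. The function name and example counts are hypothetical. For small n, the resulting value would be compared with exact tabulated critical values rather than with the chi-square approximation.

```python
from statistics import mean, variance


def dispersion_index_statistic(counts):
    """Variance (dispersion index) test statistic for count data.

    D = (n - 1) * s**2 / mean, where s**2 is the sample variance.
    Values well below n - 1 point to underdispersion, values well
    above n - 1 to overdispersion.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    m = mean(counts)
    if m == 0:
        raise ValueError("statistic is undefined when all counts are zero")
    return (n - 1) * variance(counts) / m


if __name__ == "__main__":
    # Made-up counts that are more regular than Poisson.
    sample = [3, 4, 3, 4, 3, 4, 3, 4]
    print("D =", dispersion_index_statistic(sample), "with", len(sample) - 1, "df")
```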

Roadkill is ecologically important, and a growing body of academic research seeks to understand the causes and patterns of roadkills and their impact on ecosystems. This work is motivated by a study of roadkills of the endangered natterjack toad, Bufo calamita (B. calamita), as a proportion of all amphibian roadkills. The status of B. calamita is regarded as unfavorable due to large population declines. In that study, B. calamita and total amphibian roadkills were recorded via distance sampling on a National Road of Southern Portugal between March 1995 and March 1997. Traditional binomial modeling of these data is challenged by three issues. First, the number of zeros in the B. calamita counts far exceeds its nominal level. Second, there is likely serial correlation among observations along the road. Finally, the number of total amphibian roadkills varies across sampling locations, so the total is itself likely random. All these features may contribute to overdispersion in the binomial observations. These three issues are routinely addressed one at a time: the first through zero-inflated binomial models, the second, for example, by means of random effects models for serially correlated binomial data, and the third by models for binomial data with random cluster sizes. Because each of these models handles only one issue, none of them can adequately model the data on its own. In this paper, we propose a new model to tackle these three issues simultaneously in the binomial analysis of B. calamita roadkills out of amphibian roadkills. Our approach is generally applicable to other binomial data with these three features.

We derive some statistical properties of the distribution of two Negative Binomial random variables conditional on their total. This type of model can be appropriate for paired count data with Poisson over-dispersion such that the variance is a quadratic function of the mean. This statistical model is appropriate in many ecological applications including comparative fishing studies of two vessels and/or gears. The parameter of interest is the ratio of pair means. We show that the conditional means and variances are different from the more commonly used Binomial model with variance adjusted for over-dispersion, or the Beta-Binomial model. The conditional Negative Binomial model is complicated because, unlike the Poisson case, conditioning does not eliminate the nuisance parameters. Maximum likelihood estimation with the unconditional Negative Binomial model can result in biased estimates of the over-dispersion parameter and poor confidence intervals for the ratio of means when there are many nuisance parameters. We propose three approaches to deal with nuisance parameters in the conditional Negative Binomial model. We also study a random effects Binomial model for this type of data, and we develop an adjustment to the full-sample Negative Binomial profile likelihood to reduce the bias caused by nuisance parameters. We use simulations with these methods to examine bias, precision, and accuracy of estimators and confidence intervals. We conclude that the maximum likelihood method based on the full-sample Negative Binomial adjusted profile likelihood produces the best statistical inferences for the ratio of means when paired counts have Negative Binomial distributions. However, when there is uncertainty about the type of Poisson over-dispersion, a Binomial random effects model is a good choice.

Analysis of brood sex ratios: implications of offspring clustering (total citations: 13; self-citations: 0; citations by others: 13)
Generalized linear models (GLMs) are increasingly used in modern statistical analyses of sex ratio variation because they are able to determine variable design effects on binary response data. However, in applying GLMs, authors frequently neglect the hierarchical structure of sex ratio data, thereby increasing the likelihood of committing 'type I' error. Here, we argue that whenever clustered (e.g., brood) sex ratios represent the desired level of statistical inference, the clustered data structure ought to be taken into account to avoid invalid conclusions. Neglecting the between-cluster variation and the finite number of clusters in determining test statistics, as implied by using likelihood ratio-based L2-statistics in conventional GLM, results in biased (usually overestimated) test statistics and pseudoreplication of the sample. Random variation in the sex ratio between clusters (broods) can often be accommodated by scaling residual binomial (error) variance for overdispersion, and using F-tests instead of L2-tests. More complex situations, however, require the use of generalized linear mixed models (GLMMs). By introducing higher-level random effects in addition to the residual error term, GLMMs allow an estimation of fixed effect and interaction parameters while accounting for random effects at different levels of the data. GLMMs are first required in sex ratio analyses whenever there are covariates at the offspring level of the data, but inferences are to be drawn at the brood level. Second, when interactions of effects at different levels of the data are to be estimated, random fluctuation of parameters can be taken into account only in GLMMs. Data structures requiring the use of GLMMs to avoid erroneous inferences are often encountered in ecological sex ratio studies.

A new mathematical dose-response model is proposed for the expected probability of toxic response, and also for the expected measure of the overdispersion parameter, in reproductive and developmental risk assessment. The model for the expected probability of toxic response is an improvised Weibull dose-response model incorporating the litter-size effect, while the model for the overdispersion parameter is a polynomial function of the dose level. A beta-binomial distribution for the number of offspring showing toxic responses in a litter satisfactorily accounts for the extra-binomial variation and the intralitter correlation of responses of these pups. Confidence limits for low-dose extrapolation are based on the asymptotic distribution of the likelihood ratio. The safe dose for human exposure is then calculated by simple linear extrapolation. The model for overdispersion allows us to obtain estimates of the overdispersion parameter at these dosages. This was not possible in the earlier models. The proposed model is illustrated by an application to a study on the effect of exposure to diethylhexylphthalate in mice. The results are compared with those obtained by Chen and Kodell (1989), who applied the simple Weibull dose-response model to the same data set. This paper was prepared with partial support from the United States Environmental Protection Agency under a Cooperative Agreement Number CR-815273. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency and no official endorsement should be inferred.

Wenger SJ, Freeman MC. Ecology, 2008, 89(10): 2953-2959
Researchers have developed methods to account for imperfect detection of species with either occupancy (presence–absence) or count data using replicated sampling. We show how these approaches can be combined to simultaneously estimate occurrence, abundance, and detection probability by specifying a zero-inflated distribution for abundance. This approach may be particularly appropriate when patterns of occurrence and abundance arise from distinct processes operating at differing spatial or temporal scales. We apply the model to two data sets: (1) previously published data for a species of duck, Anas platyrhynchos, and (2) data for a stream fish species, Etheostoma scotti. We show that in these cases, an incomplete-detection zero-inflated modeling approach yields a better fit to the data than other models. We propose that zero-inflated abundance models accounting for incomplete detection be considered when replicate count data are available.

Modeling empirical distributions of repeated counts with parametric probability distributions is a frequent problem when studying species abundance. One must choose a family of distributions which is flexible enough to take into account very diverse patterns and possess parameters with clear biological/ecological interpretations. The negative binomial distribution fulfills these criteria and was selected for modeling counts of marine fish and invertebrates. This distribution depends on a vector \(\left(K,\mathfrak{P}\right)\) of parameters, and ranges from the Poisson distribution (when \(K\rightarrow +\infty\)) to Fisher’s log-series (when \(K\rightarrow 0\)). Moreover, these parameters have biological/ecological interpretations which are detailed in the literature and in this study. We compared three estimators of \(K\), \(\mathfrak{P}\) and the parameter \(\alpha\) of Fisher’s log-series, following the work of Rao CR (Statistical ecology. Pennsylvania State University Press, University Park, 1971) on a three-parameter unstandardized variant of the negative binomial distribution. We further investigated the coherence underlying parameter values resulting from the different estimators, using both real count data collected in the Mauritanian Exclusive Economic Zone (MEEZ) during the period 1987–2010 and realistic simulations of these data. In the case of the MEEZ, we first built homogeneous lists of counts (replicates), by gathering observations of each species with respect to “typical environments” obtained by clustering the sampled stations. The best estimation of \(\left(K,\mathfrak{P}\right)\) was generally obtained by penalized minimum Hellinger distance estimation. Interestingly, the parameters of most of the correctly sampled species seem compatible with the classical birth-and-death model of population growth with immigration by Kendall (Biometrika 35:6–15, 1948).
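The Poisson end of this range can be checked numerically. Because the abstract does not define \(\mathfrak{P}\), the sketch below assumes a different but common parameterization of the negative binomial, by its mean m and the parameter K. This choice and the function names are assumptions made for illustration, and the log-series limit is not shown.

```python
import math


def nb_pmf(x, m, K):
    """Negative binomial probability of count x, with mean m and
    parameter K (assumed mean-K parameterization)."""
    log_p = (math.lgamma(K + x) - math.lgamma(K) - math.lgamma(x + 1)
             - K * math.log1p(m / K)          # K * log(K / (K + m))
             + x * math.log(m / (K + m)))
    return math.exp(log_p)


def poisson_pmf(x, m):
    """Poisson probability of count x with mean m."""
    return math.exp(x * math.log(m) - m - math.lgamma(x + 1))


if __name__ == "__main__":
    m = 2.0
    # As K grows, the negative binomial probabilities approach the
    # Poisson probabilities with the same mean.
    for K in (0.5, 5.0, 50.0, 5000.0):
        gap = max(abs(nb_pmf(x, m, K) - poisson_pmf(x, m)) for x in range(30))
        print("K =", K, "largest pmf gap:", gap)
```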

Knowledge of the relationship between species traits and species distribution in fragmented landscapes is important for understanding current distribution patterns and as background information for predictive models of the effect of future landscape changes. The existing studies on the topic suffer from several drawbacks. First, they usually consider only traits related to dispersal ability and not growth. Furthermore, they do not apply phylogenetic corrections, and we thus do not know how considerations of phylogenetic relationships can alter the conclusions. Finally, they usually apply only one technique to calculate habitat isolation, and we do not know how other isolation measures would change the results. We studied the issues using 30 species forming congeneric pairs occurring in fragmented dry grasslands. We measured traits related to dispersal, survival, and growth in the species and recorded distribution of the species in 215 grassland fragments. We show many strong relationships between species traits related to both dispersal and growth and species distribution in the landscape, such as the positive relationship between habitat occupancy and anemochory and negative relationships between habitat occupancy and seed dormancy. The directions of these relationships, however, often change after application of phylogenetic correction. For example, more isolated habitats host species with smaller seeds. After phylogenetic correction, however, they turn out to host species with larger seeds. The conclusions also partly change depending on how we calculate habitat isolation. Specifically, habitat isolation calculated from occupied habitats only has the highest predictive power. This indicates slow dynamics of the species. All the results support the expectation that species traits have a high potential to explain patterns of species distribution in the landscape and that they can be used to build predictive models of species distribution. The specific conclusions are, however, dependent on the technique used, and we should carefully consider this when comparing among different studies. Since different techniques answer slightly different questions, we should attempt to use analyses both with and without phylogenetic correction and explore different isolation measures whenever possible and compare the results.
