Similar Articles
Found 20 similar articles.
1.
Various methods exist to model a species’ niche and geographic distribution using environmental data for the study region and occurrence localities documenting the species’ presence (typically from museums and herbaria). In presence-only modelling, geographic sampling bias and small sample sizes represent challenges for many species. Overfitting to the bias and/or noise characteristic of such datasets can seriously compromise model generality and transferability, which are critical to many current applications, including studies of invasive species, the effects of climatic change, and niche evolution. Even when transferability is not necessary, applications to many areas, including conservation biology, macroecology, and zoonotic diseases, require models that are not overfit. We evaluated these issues using a maximum entropy approach (Maxent) for the shrew Cryptotis meridensis, which is endemic to the Cordillera de Mérida in Venezuela. To simulate strong sampling bias, we divided localities into two datasets: those from a portion of the species’ range that has seen high sampling effort (for model calibration) and those from other areas of the species’ range, where less sampling has occurred (for model evaluation). Before modelling, we assessed the climatic values of localities in the two datasets to determine whether any environmental bias accompanies the geographic bias. Then, to identify optimal levels of model complexity (and minimize overfitting), we built models and tuned model settings, comparing performance with that achieved using default settings. We randomly selected localities for model calibration (sets of 5, 10, 15, and 20 localities) and varied the level of model complexity considered (linear versus both linear and quadratic features) and two aspects of the strength of protection against overfitting (regularization). Environmental bias indeed corresponded to the geographic bias between datasets, with differences in median and observed range (minima and/or maxima) for some variables. Model performance varied greatly according to the level of regularization. Intermediate regularization consistently led to the best models, with decreased performance at low and generally at high regularization. Optimal levels of regularization differed between sample-size-dependent and sample-size-independent approaches, but both reached similar levels of maximal performance. In several cases, the optimal regularization value was different from (usually higher than) the default one. Models calibrated with both linear and quadratic features outperformed those made with just linear features. Results were remarkably consistent across the examined sample sizes. Models made with few and biased localities achieved high predictive ability when appropriate regularization was employed and optimal model complexity was identified. Species-specific tuning of model settings can have great benefits over the use of default settings.
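A minimal sketch of the tuning loop described above: Maxent itself is separate software, so an L1-penalized logistic regression stands in for it here, and the data, feature construction, and candidate regularization values are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_data(n):
    X = rng.normal(size=(n, 4))                      # hypothetical climate variables
    score = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.5, n)
    return X, score > np.median(score)               # presence vs. background

X_cal, y_cal = make_data(20)                         # few (possibly biased) calibration localities
X_eval, y_eval = make_data(200)                      # localities from the rest of the range

def features(X):
    return np.hstack([X, X ** 2])                    # linear + quadratic feature classes

best = None
for reg in [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]:          # candidate regularization strengths
    model = LogisticRegression(penalty="l1", C=1.0 / reg, solver="liblinear")
    model.fit(features(X_cal), y_cal)
    auc = roc_auc_score(y_eval, model.predict_proba(features(X_eval))[:, 1])
    if best is None or auc > best[1]:
        best = (reg, auc)
print("best regularization strength:", best)
```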

2.
A recent trend is to estimate landscape metrics using sample data, and cost-efficiency is one important reason for this development. In this study, line intersect sampling (LIS) was used as an alternative to wall-to-wall mapping for estimating Shannon’s diversity index and edge length and density. Monte Carlo simulation was applied to study the statistical performance of the estimators. All combinations of two sampling designs (random and systematic distribution of transects), four sample sizes, five transect configurations (straight line, L, Y, triangle, and quadrat), two transect orientations (fixed and random), and three configuration lengths were tested, each with a large number of simulations. The reference data were 50 photos of size 1 km², already manually delineated in vector format by photo interpreters in a GIS environment. Performance was compared by root mean square error (RMSE) and bias. For all three metrics, the best combination was the systematic design with the straight-line transect configuration, with little difference between fixed and random orientation of transects. The rate of decrease of RMSE with increasing sample size and line length was studied with a mixed linear model. The RMSE decreased to a larger degree with the systematic design than with the random one, especially with increasing sample size. Due to the nonlinearity in the definition of Shannon’s diversity index, its estimator has a small negative bias, which decreases with sample size and line length. Finally, a time study was conducted, measuring the time for registration of line intersections and their lengths on non-delineated aerial photos. The time study showed that long sampling lines were more cost-efficient for photo-interpretation than short ones.
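The RMSE and bias comparisons reduce to a simple computation over Monte Carlo replicates. A sketch with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
true_edge_density = 55.0                     # km/km^2, known from the wall-to-wall map
# stand-in for 10,000 simulated LIS estimates of edge density
estimates = true_edge_density + rng.normal(1.0, 8.0, size=10_000)

bias = estimates.mean() - true_edge_density
rmse = np.sqrt(np.mean((estimates - true_edge_density) ** 2))
print(f"bias = {bias:.2f}, RMSE = {rmse:.2f}")
```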

3.
Ranked set sampling can provide an efficient basis for estimating parameters of environmental variables, particularly when sampling costs are intrinsically high. Various ranked set estimators are considered for the population mean and contrasted in terms of their efficiencies and usefulness, with special concern for sample design considerations. Specifically, we consider the effects of the form of the underlying random variable, optimisation of efficiency, and how to allocate sampling effort for best effect (e.g. one large sample or several smaller ones of the same total size). The various prospects are explored for two important positively skewed random variables (lognormal and extreme value) and explicit results are given for these cases. Whilst it turns out that the best approach is to use the largest possible single sample and the optimal ranked set best linear estimator (ranked set BLUE), we find some interesting qualitatively different conclusions for the two skewed distributions.
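To illustrate the mechanics, the sketch below compares a balanced ranked set sample mean with a simple random sample mean of equal size for a lognormal variable, assuming perfect ranking; the set size, number of cycles, and replicate count are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
k, cycles, reps = 3, 4, 5_000                # set size, cycles (n = k*cycles), replicates

def rss_mean(rng):
    vals = []
    for _ in range(cycles):
        for i in range(k):
            s = np.sort(rng.lognormal(0.0, 1.0, size=k))  # rank a fresh set of k units
            vals.append(s[i])                # measure only the i-th order statistic
    return np.mean(vals)

rss = np.array([rss_mean(rng) for _ in range(reps)])
srs = np.array([rng.lognormal(0.0, 1.0, size=k * cycles).mean() for _ in range(reps)])
print("var(RSS)/var(SRS) =", rss.var() / srs.var())   # < 1: RSS is more efficient
```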

4.
Fieberg J. Ecology 2007, 88(4): 1059-1066
Two oft-cited drawbacks of kernel density estimators (KDEs) of home range are their sensitivity to the choice of smoothing parameter(s) and their need for independent data. Several simulation studies have been conducted to compare the performance of objective, data-based methods of choosing optimal smoothing parameters in the context of home range and utilization distribution (UD) estimation. Lost in this discussion of choice of smoothing parameters is the general role of smoothing in data analysis, namely, that smoothing serves to increase precision at the cost of increased bias. A primary goal of this paper is to illustrate this bias-variance trade-off by applying KDEs to sampled locations from simulated movement paths. These simulations will also be used to explore the role of autocorrelation in estimating UDs. Autocorrelation can be reduced (1) by increasing study duration (for a fixed sample size) or (2) by decreasing the sampling rate. While the first option will often be reasonable, for a fixed study duration higher sampling rates should always result in improved estimates of space use. Further, KDEs with typical data-based methods of choosing smoothing parameters should provide competitive estimates of space use for fixed study periods unless autocorrelation substantially alters the optimal level of smoothing.
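The bias-variance trade-off can be seen directly by varying the KDE bandwidth. A minimal one-dimensional sketch, with hypothetical locations and scipy's gaussian_kde standing in for a home-range KDE:

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(3)
locs = rng.normal(0.0, 1.0, size=100)        # sampled locations along one axis
grid = np.linspace(-4.0, 4.0, 201)
dx = grid[1] - grid[0]
truth = norm.pdf(grid)                       # true utilization distribution

for factor in [0.1, 0.4, 1.5]:               # under-, moderately, and over-smoothed
    kde = gaussian_kde(locs, bw_method=factor)
    ise = np.sum((kde(grid) - truth) ** 2) * dx   # integrated squared error
    print(f"bandwidth factor {factor}: ISE = {ise:.4f}")
```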

5.
We propose a Bayesian hierarchical modeling approach for estimating the size of a closed population from data obtained by identifying individuals through photographs of natural markings. We assume that noisy measurements of a set of distinctive features are available for each individual present in a photographic catalogue. To estimate the population size from two catalogues obtained during two different sampling occasions, we embed the standard two-stage $M_t$ capture–recapture model for closed populations into a multivariate normal data matching model that identifies the common individuals across the catalogues. In addition to estimating the population size while accounting for the matching-process uncertainty, this hierarchical modelling approach allows the common individuals to be identified using the information provided by the capture–recapture model. In this way, our model also represents a novel and reliable tool for reducing the effort researchers must expend in matching individuals. We illustrate and motivate the proposed approach via a real data set of photo-identification of narwhals. Moreover, we compare our method with a set of possible alternative approaches by using both the empirical data set and a simulation study.
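For orientation only: when the matching is treated as known rather than uncertain, the two-occasion $M_t$ model reduces to the classical Lincoln-Petersen estimator; the paper's contribution is precisely to relax that assumption. A sketch with hypothetical counts, using Chapman's bias-corrected form:

```python
# n1, n2: individuals photographed on occasions 1 and 2; m: individuals matched in both.
n1, n2, m = 80, 95, 24
N_chapman = (n1 + 1) * (n2 + 1) / (m + 1) - 1   # Chapman's bias-corrected Lincoln-Petersen
print(f"estimated population size: {N_chapman:.0f}")
```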

6.
Recently the two-phase adaptive stratified sampling design proposed by Francis (1984) has been extended by Manly et al. (2002) for situations where several biological populations are sampled simultaneously, and where this is done at several different geographical locations in order to estimate population totals or means. The method uses the results from a first-phase sample to decide how best to allocate a second-phase sample to locations and strata, in order to maximise a criterion (based on estimated coefficients of variation) that measures the accuracy of estimation of population totals, for all variables at all locations. One potential problem with this method is bias in the estimators of the population totals and means. In this paper bootstrapping is considered as a means of overcoming these biases. It is shown using model populations of Pacific walrus and shellfish, based on real data, that bootstrapping is a useful tool for removing about half of the bias. This is also confirmed by simulations using artificial data.
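The bootstrap bias correction itself is simple: the bias is estimated as the mean of the bootstrap replicates minus the original estimate, and then subtracted off. A generic sketch with a hypothetical nonlinear statistic:

```python
import numpy as np

rng = np.random.default_rng(4)
sample = rng.lognormal(1.0, 0.8, size=60)

est = np.log(sample.mean())                  # a nonlinear, hence biased, statistic
boot = np.array([np.log(rng.choice(sample, size=sample.size, replace=True).mean())
                 for _ in range(2000)])
bias_hat = boot.mean() - est                 # bootstrap estimate of bias
corrected = est - bias_hat                   # equivalently 2*est - boot.mean()
print(f"raw = {est:.4f}, bias-corrected = {corrected:.4f}")
```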

7.
Many simulation studies have examined the properties of distance sampling estimators of wildlife population size. When assumptions hold and distances generated from a detection model are fitted using the same model, the estimators are known to perform well. However, in practice, the true model is unknown. Therefore, standard practice includes model selection, typically using model comparison tools such as the Akaike Information Criterion. Here we examine the performance of standard distance sampling estimators under model selection. We compare line and point transect estimators with distances simulated from two detection functions, hazard-rate and exponential power series (EPS), over a range of sample sizes. To mimic the real-world context where the true model may not be part of the candidate set, EPS models were not included as candidates, except for the half-normal parameterization. We found median bias depended on sample size (estimators being asymptotically unbiased) and on the form of the true detection function: negative bias (up to 15% for line transects and 30% for point transects) when the shoulder of maximum detectability was narrow, and positive bias (up to 10% for line transects and 15% for point transects) when it was wide. Generating unbiased simulations requires careful choice of detection function or very large datasets. Practitioners should collect data that result in detection functions with a shoulder similar to a half-normal and use the monotonicity constraint. Narrow-shouldered detection functions can be avoided through good field procedures, and those with a wide shoulder are unlikely to occur, due to heterogeneity in detectability.
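As a reference point, here is a minimal sketch of fitting the half-normal detection function to line-transect perpendicular distances by maximum likelihood and converting to a density estimate; the data, transect length, and absence of truncation are all simplifying assumptions, and no model selection is performed.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import halfnorm

rng = np.random.default_rng(5)
dists = np.abs(rng.normal(0.0, 30.0, size=120))     # detected perpendicular distances (m)

def negloglik(sigma):
    # untruncated half-normal density of detected distances
    return -np.sum(halfnorm.logpdf(dists, scale=sigma))

sigma_hat = minimize_scalar(negloglik, bounds=(1.0, 200.0), method="bounded").x
esw = sigma_hat * np.sqrt(np.pi / 2)          # effective strip half-width for half-normal
L = 10_000.0                                  # total transect length (m), hypothetical
density = dists.size / (2 * L * esw)          # animals per m^2
print(f"sigma = {sigma_hat:.1f} m, ESW = {esw:.1f} m, density = {density:.2e} /m^2")
```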

8.
Adaptive two-stage one-per-stratum sampling
We briefly describe adaptive cluster sampling designs in which the initial sample is taken according to a Markov chain one-per-stratum design (Breidt, 1995) and one or more secondary samples are taken within strata if units in the initial sample satisfy a given condition C. An empirical study of the behavior of the estimation procedure is conducted for three small artificial populations for which adaptive sampling is appropriate. The specific sampling strategy used in the empirical study was a single random-start systematic sample with predefined systematic samples within strata when the initially sampled unit in that stratum satisfies C. The bias of the Horvitz-Thompson estimator for this design is usually very small when adaptive sampling is conducted in a population for which it is suited. In addition, we compare the behavior of several alternative estimators of the standard error of the Horvitz-Thompson estimator of the population total. The best estimator of the standard error is population-dependent, but it is not unreasonable to use the Horvitz-Thompson estimator of the variance. Unfortunately, the distribution of the estimator is highly skewed; hence, the usual approach of constructing confidence intervals assuming normality cannot be used here.
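The Horvitz-Thompson estimator weights each observed value by the inverse of its inclusion probability. A minimal sketch with hypothetical values; in the adaptive design above, the inclusion probabilities follow from the systematic-within-strata rule.

```python
import numpy as np

y = np.array([12.0, 0.0, 7.0, 31.0])      # observed unit totals
pi = np.array([0.10, 0.10, 0.25, 0.25])   # first-order inclusion probabilities
t_ht = np.sum(y / pi)                     # design-unbiased for the population total
print(f"Horvitz-Thompson total: {t_ht:.1f}")
```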

9.
Models that predict distribution are now widely used to understand the patterns and processes of plant and animal occurrence as well as to guide conservation and management of rare or threatened species. Application of these methods has led to corresponding studies evaluating the sensitivity of model performance to requisite data and other factors that may lead to imprecise or false inferences. We expand upon these works by providing a relative measure of the sensitivity of model parameters and prediction to common sources of error, bias, and variability. We used a one-at-a-time sample design and GPS location data for woodland caribou (Rangifer tarandus caribou) to assess one common species-distribution model: a resource selection function. Our measures of sensitivity included change in coefficient values, prediction success, and the area of mapped habitats following the systematic introduction of geographic error and bias in occurrence data, thematic misclassification of resource maps, and variation in model design. Results suggested that error, bias, and model variation have a large impact on the direct interpretation of coefficients. Prediction success and the definition of important habitats were less responsive to the perturbations we introduced to the baseline model. Model coefficients, prediction success, and area of ranked habitats were most sensitive to positional error in species locations, followed by sampling bias, misclassification of resources, and variation in model design. We recommend that researchers report, and practitioners consider, the levels of error and bias introduced to predictive species-distribution models. Formal sensitivity and uncertainty analyses are the most effective means for evaluating and focusing improvements on input data and for considering the range of values possible from imperfect models.
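A one-at-a-time sensitivity check refits the model after perturbing a single input while holding everything else fixed. The sketch below perturbs locations with positional error and compares the fitted coefficient; a plain logistic used/available model stands in for the RSF, and the data and covariate surface are entirely hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
xy = rng.uniform(0, 100, size=(500, 2))                  # used + available locations
covar = np.sin(xy[:, 0] / 10) + 0.02 * xy[:, 1]          # resource value at each location
used = rng.binomial(1, 1 / (1 + np.exp(-(covar - 1))))   # use driven by the resource

def rsf_coef(xy_pts):
    c = np.sin(xy_pts[:, 0] / 10) + 0.02 * xy_pts[:, 1]  # re-extract covariate at (perturbed) points
    return LogisticRegression().fit(c.reshape(-1, 1), used).coef_[0, 0]

base = rsf_coef(xy)
for err in [0.0, 5.0, 15.0]:                             # GPS error magnitude
    pert = rsf_coef(xy + rng.normal(0, err, size=xy.shape))
    print(f"positional error sd={err:>4}: coefficient = {pert:.3f} (baseline {base:.3f})")
```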

10.
A hierarchical model for spatial capture-recapture data
Royle JA, Young KV. Ecology 2008, 89(8): 2281-2289
Estimating density is a fundamental objective of many animal population studies. Application of methods for estimating population size from ostensibly closed populations is widespread, but ineffective for estimating absolute density because most populations are subject to short-term movements or so-called temporary emigration. This phenomenon invalidates the resulting estimates because the effective sample area is unknown. A number of methods involving the adjustment of estimates based on heuristic considerations are in widespread use. In this paper, a hierarchical model of spatially indexed capture-recapture data is proposed for sampling based on area searches of spatial sample units subject to uniform sampling intensity. The hierarchical model contains explicit models for the distribution of individuals and their movements, in addition to an observation model that is conditional on the location of individuals during sampling. Bayesian analysis of the hierarchical model is achieved by the use of data augmentation, which allows for a straightforward implementation in the freely available software WinBUGS. We present results of a simulation study that was carried out to evaluate the operating characteristics of the Bayesian estimator under variable densities and movement patterns of individuals. An application of the model is presented for survey data on the flat-tailed horned lizard (Phrynosoma mcallii) in Arizona, USA.

11.
Efficiency of composite sampling for estimating a lognormal distribution
In many environmental studies measuring the amount of a contaminant in a sampling unit is expensive. In such cases, composite sampling is often used to reduce data collection cost. However, composite sampling is known to be beneficial for estimating the mean of a population, but not necessarily for estimating the variance or other parameters. As some applications, for example, Monte Carlo risk assessment, require an estimate of the entire distribution, and as the lognormal model is commonly used in environmental risk assessment, in this paper we investigate the efficiency of composite sampling for estimating a lognormal distribution. In particular, we examine the magnitude of savings in the number of measurements over simple random sampling, and the nature of its dependence on composite size and the parameters of the distribution, using simulation and asymptotic calculations.
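One simple route, sketched below with method of moments (not necessarily the estimator studied in the paper), exploits the fact that a size-k composite average preserves the mean and divides the variance by k, from which the lognormal parameters can be backed out. Data and settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma, k, n_comp = 0.5, 1.0, 4, 200
# each composite physically pools k units, so its measurement is their average
composites = rng.lognormal(mu, sigma, size=(n_comp, k)).mean(axis=1)

m, v = composites.mean(), composites.var(ddof=1) * k   # recover unit-level moments
sigma2_hat = np.log(1 + v / m ** 2)                    # lognormal moment inversion
mu_hat = np.log(m) - sigma2_hat / 2
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {np.sqrt(sigma2_hat):.3f}")
```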

12.
Laboratory analyses in a variety of contexts may result in left- and interval-censored measurements. We develop and evaluate a maximum likelihood approach to linear regression analysis in this setting and compare this approach to commonly used simple substitution methods. We explore via simulation the impact of censoring fraction and sample size on bias and power in a range of settings. The maximum likelihood approach yields only a moderate increase in power, but we show that the bias in substitution estimates may be substantial.
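The censored likelihood replaces the density term of each left-censored observation with the corresponding normal CDF term. A minimal sketch contrasting it with half-the-limit substitution on simulated data; the detection limit and substitution rule are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(8)
n, limit = 200, 0.5
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, n)
cens = y < limit                              # left-censored: only "below limit" is known

def negloglik(p):
    b0, b1, log_s = p
    mu, s = b0 + b1 * x, np.exp(log_s)
    ll = norm.logpdf(y[~cens], mu[~cens], s).sum()       # fully observed values
    ll += norm.logcdf((limit - mu[cens]) / s).sum()      # censored: P(Y < limit)
    return -ll

mle = minimize(negloglik, x0=[0.0, 1.0, 0.0], method="Nelder-Mead").x
y_sub = np.where(cens, limit / 2, y)          # naive half-the-limit substitution
b1_sub = np.polyfit(x, y_sub, 1)[0]
print(f"ML slope = {mle[1]:.3f}, substitution slope = {b1_sub:.3f} (true 2.0)")
```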

13.
The mark-resight method for estimating the size of a closed population can in many circumstances be a less expensive and less invasive alternative to traditional mark-recapture. Despite its potential advantages, one major drawback of traditional mark-resight methodology is that the number of marked individuals in the population available for resighting needs to be known exactly. In real field studies, this can be quite difficult to accomplish. Here we develop a Bayesian model for estimating abundance when sighting data are acquired from distinct sampling occasions without replacement, but the exact number of marked individuals is unknown. By first augmenting the data with some fixed number of individuals comprising a marked “super population,” the problem may then be reformulated in terms of estimating the proportion of this marked super population that was actually available for resighting. This allows the data for the marked population available for resighting to be modeled as random realizations from a binomial logit-normal distribution. We demonstrate the use of our model to estimate the New Zealand robin (Petroica australis) population size in a region of Fiordland National Park, New Zealand. We then evaluate the performance of the proposed model relative to other estimators via a series of simulation experiments. We generally found our model to have advantages over other models when sample sizes are small and resighting probabilities are individually heterogeneous. Due to limited budgets and the inherent variability between individuals, this is a common occurrence in mark-resight population studies. WinBUGS and R code to carry out these analyses is available from .

14.
We introduce a methodology to infer zones of high potential for the habitat of a species, useful for management of biodiversity, conservation, biogeography, ecology, or sustainable use. Inference is based on a set of sites where the presence of the species has been reported. Each site is associated with covariate values, measured on discrete scales. We compute the predictive probability that the species is present at each node of a regular grid. Possible spatial bias for sites of presence is accounted for. Since the resulting posterior distribution does not have a closed form, a Markov chain Monte Carlo (MCMC) algorithm is implemented. However, we also describe an approximation to the posterior distribution, which avoids MCMC. Relevant features of the approach are that specific notions of data acquisition such as sampling intensity and detectability are accounted for, and that available a priori information regarding areas of distribution of the species is incorporated in a clear-cut way. These concepts, arising in the presence-only context, are not addressed in alternative methods. We also consider an uncertainty map, which measures the variability for the predictive probability at each node on the grid. A simulation study is carried out to test and compare our approach with other standard methods. Two case studies are also presented.

15.
The mean of a balanced ranked set sample is more efficient than the mean of a simple random sample of equal size, and the precision of ranked set sampling may be increased by using an unbalanced allocation when the population distribution is highly skewed. The aim of this paper is to show, with real data, the practical benefits of unequal allocation in simultaneously estimating the means of several skewed variables. In particular, the allocation rule suggested in the literature for a single skewed distribution may be easily applied when more than one skewed variable is of interest and an auxiliary variable correlated with them is available. This method can lead to substantial gains in precision for all the study variables with respect to simple random sampling, and also with respect to balanced ranked set sampling.

16.
Estimates of a population’s growth rate and process variance from time-series data are often used to calculate risk metrics such as the probability of quasi-extinction, but temporal correlations in the data from sampling error, intrinsic population factors, or environmental conditions can bias process variance estimators and detrimentally affect risk predictions. It has been claimed (McNamara and Harding, Ecol Lett 7:16–20, 2004) that estimates of the long-term variance that incorporate observed temporal correlations in population growth are unaffected by sampling error; however, no estimation procedures were proposed for time-series data. We develop a suite of such long-term variance estimators, and use simulated data with temporally autocorrelated population growth and sampling error to evaluate their performance. In some cases, we get nearly unbiased long-term variance estimates despite ignoring sampling error, but the utility of these estimators is questionable because of large estimation uncertainty and difficulties in estimating correlation structure in practice. Process variance estimators that ignored temporal correlations generally gave more precise estimates of the variability in population growth and of the probability of quasi-extinction. We also found that the estimation of probability of quasi-extinction was greatly improved when quasi-extinction thresholds were set relatively close to population levels. Because of precision concerns, we recommend using simple models for risk estimates despite potential biases, and limiting inference to quantifying relative risk; e.g., changes in risk over time for a single population or comparative risk among populations.
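The baseline "simple model" workflow the authors end up recommending can be sketched as follows, with hypothetical counts: estimate drift and process variance from log growth rates (ignoring sampling error and autocorrelation), then simulate the probability of crossing a quasi-extinction threshold.

```python
import numpy as np

rng = np.random.default_rng(9)
counts = np.array([120, 110, 130, 95, 100, 85, 90, 70, 80, 75])  # hypothetical census
r = np.diff(np.log(counts))                 # log population growth rates
mu_hat, s2_hat = r.mean(), r.var(ddof=1)    # drift and process variance

horizon, threshold, nsim = 50, 20, 10_000
logN = np.log(counts[-1]) + np.cumsum(
    rng.normal(mu_hat, np.sqrt(s2_hat), size=(nsim, horizon)), axis=1)
p_qe = np.mean((logN <= np.log(threshold)).any(axis=1))   # ever below threshold
print(f"mu = {mu_hat:.3f}, sigma^2 = {s2_hat:.3f}, P(quasi-extinction) = {p_qe:.3f}")
```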

17.
Shen TJ, He F. Ecology 2008, 89(7): 2052-2060
Most richness estimators currently in use are derived from models that consider sampling with replacement or from the assumption of infinite populations. Neither assumption is suitable for sampling sessile organisms such as plants, where quadrats are often sampled without replacement and the area of study is always limited. In this paper, we propose an incidence-based parametric richness estimator that considers quadrat sampling without replacement in a fixed area. The estimator is derived from a zero-truncated binomial distribution for the number of quadrats containing a given species (e.g., species i) and a modified beta distribution for the probability of presence-absence of a species in a quadrat. The maximum likelihood estimate of richness is given explicitly and can be solved easily. The variance of the estimate is also obtained. The performance of the estimator is tested against nine other existing incidence-based estimators using two tree data sets where the true numbers of species are known. Results show that the new estimator is insensitive to sample size and outperforms the other methods as judged by the root mean squared errors. The superiority of the new method is particularly noticeable when a large quadrat size is used, suggesting that a few large quadrats are preferred over many small ones when sampling diversity.

18.
A probabilistic sampling approach for design-unbiased estimation of area-related quantitative characteristics of spatially dispersed population units is proposed. The developed field protocol includes a fixed number of three units per sampling location and is based on partial triangulations over their natural neighbors to derive the individual inclusion probabilities. The performance of the proposed design is tested in comparison to fixed-area sample plots in a simulation with two forest stands. Evaluation is based on a general approach for areal sampling in which all characteristics of the resulting population of possible samples are derived analytically by means of a complete tessellation of the areal sampling frame. The example simulation shows promising results. Expected errors under this design are comparable to those of fixed-area sample plots that include a much greater number of trees per plot.

19.
Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size-biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a lineal or areal function of the random variable of interest, respectively. Often, interest lies in estimating a parametric probability density for the data. In forestry, the Weibull function has been used extensively for such purposes. Estimating equations for method of moments and maximum likelihood for two- and three-parameter Weibull distributions are presented. Fitting is illustrated with an example from an area-biased angle-gauge sample of standing trees in a woodlot. Finally, some specific points concerning the form of the size-biased densities are reported.
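For the area-biased case (weight proportional to x^2, as with angle-gauge selection by basal area), the size-biased density is f_2(x) = x^2 f(x) / E[X^2], with E[X^alpha] = scale^alpha * Gamma(1 + alpha/shape) for a two-parameter Weibull. A maximum likelihood sketch on simulated data; the weighted-resampling scheme and settings are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln
from scipy.stats import weibull_min

rng = np.random.default_rng(10)
shape_true, scale_true, alpha = 2.5, 30.0, 2

# simulate an area-biased sample by weighted resampling from a large pool
pool = weibull_min.rvs(shape_true, scale=scale_true, size=200_000, random_state=11)
x = rng.choice(pool, size=300, p=pool**alpha / np.sum(pool**alpha))

def negloglik(p):
    shape, scale = np.exp(p)                      # optimize on the log scale
    ll = alpha * np.log(x) + weibull_min.logpdf(x, shape, scale=scale)
    ll -= alpha * np.log(scale) + gammaln(1 + alpha / shape)   # log E[X^alpha]
    return -ll.sum()

fit = np.exp(minimize(negloglik, x0=np.log([1.0, np.mean(x)]), method="Nelder-Mead").x)
print(f"shape = {fit[0]:.2f} (true {shape_true}), scale = {fit[1]:.1f} (true {scale_true})")
```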

20.
Munoz F, Couteron P, Ramesh BR, Etienne RS. Ecology 2007, 88(10): 2482-2488
The neutral theory of S. P. Hubbell postulates a two-scale hierarchical framework consisting of a metacommunity following the speciation-drift equilibrium characterized by the "biodiversity number" theta, and local communities following the migration-drift equilibrium characterized by the "migration rate" m (or the "fundamental dispersal number" I). While Etienne's sampling formula allows simultaneous estimation of theta and m from a single sample of a local community, its applicability to a network of (rather small) samples is questionable. We define here an alternative two-stage approach that estimates theta from an adequate subset of the individuals sampled in the field (using Ewens' sampling formula) and m from the community samples (using Etienne's sampling formula). We compare its results with the simultaneous estimation of theta and m (one-stage estimation), for simulated neutral samples and for 50 1-ha plots of evergreen forest in South India. The one-stage approach exhibits problems of bias and of poor differentiability between high-theta, low-m and low-theta, high-m solution domains. Conversely, the two-stage approach yielded reasonable estimates and is to be preferred when several small, scattered plots are available instead of a single large one.
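The first stage rests on a standard property of Ewens' sampling formula: the observed species count S is sufficient for theta, and the MLE solves S = sum_{i=0}^{n-1} theta / (theta + i), where n is the number of individuals. A sketch with hypothetical counts; the second stage (m via Etienne's formula) is omitted.

```python
import numpy as np
from scipy.optimize import brentq

n, S = 5000, 120          # individuals and species in the pooled subset

def expected_S(theta):
    # E[S | n, theta] under Ewens' sampling formula
    return np.sum(theta / (theta + np.arange(n)))

theta_hat = brentq(lambda t: expected_S(t) - S, 1e-6, 1e4)
print(f"theta_hat = {theta_hat:.2f}")
```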

