Similar Documents
20 similar documents found.
1.
Aboveground biomass (AGB) reflects multiple and often undetermined ecological and land-use processes, yet detailed landscape-level studies of AGB are uncommon due to the difficulty in making consistent measurements at ecologically relevant scales. Working in a protected mediterranean-type landscape (Jasper Ridge Biological Preserve, California, USA), we combined field measurements with remotely sensed data from the Carnegie Airborne Observatory's light detection and ranging (lidar) system to create a detailed AGB map. We then developed a predictive model using a maximum of 56 explanatory variables derived from geologic and historic-ownership maps, a digital elevation model, and geographic coordinates to evaluate possible controls over currently observed AGB patterns. We tested both ordinary least-squares regression (OLS) and autoregressive approaches. OLS explained 44% of the variation in AGB, and simultaneous autoregression with a 100-m neighborhood improved the fit to r2 = 0.72, while reducing the number of significant predictor variables from 27 in the OLS model to 11 in the autoregressive model. We also compared the results from these approaches to a more typical field-derived data set; we randomly sampled 5% of the data 1000 times and used the same OLS approach each time. Environmental filters including incident solar radiation, substrate type, and topographic position were significant predictors of AGB in all models. Past ownership was a minor but significant predictor, despite the long history of conservation at the site. The weak predictive power of these environmental variables, and the significant improvement when spatial autocorrelation was incorporated, highlight the importance of land-use history, disturbance regime, and population dynamics as controllers of AGB.

2.
Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation of non-significant results as well.
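The quoted 40% can be reproduced with a back-of-the-envelope calculation: treating the ten terms of such a full model (four main effects plus their six two-way interactions) as independent tests at α = 0.05 — a simplifying assumption, since the tests are not truly independent — gives a family-wise error rate of about 0.40:

```python
# Probability of at least one spurious 'significant' effect when a full
# model with four predictors and their two-way interactions is tested.
# 4 main effects + C(4, 2) = 6 interactions = 10 terms; independence of
# the tests is assumed here, which is only an approximation.
from math import comb

alpha = 0.05
k = 4 + comb(4, 2)            # 10 terms in the full model
p_any = 1 - (1 - alpha) ** k  # family-wise error rate
print(f"{p_any:.2f}")         # ~0.40, matching the figure quoted above
```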

3.
When individual model parameters must be measured in field or laboratory experiments, the provision of feedback information for allocation of research efforts is an important function of modeling. Both sensitivity analysis and Monte Carlo error analysis can be used to determine which parameters require intensified measurement effort. When both methods are applied to a stream ecosystem model, the assumptions of sensitivity analysis are violated if reasonable estimates of measurement errors on parameters are used. Sensitivity analysis estimates a linear relationship between a state variable and a parameter and largely ignores higher-order effects. In the model investigated in this study, higher-order effects dominate prediction error, and the results of sensitivity analysis are misleading. It is suggested that the simple correlation coefficient derived from analysis of Monte Carlo simulations is a more reasonable way to rank model parameters according to their contribution to prediction uncertainty. For the stream model used in this study, halving the variance on the four parameters indicated as most important by sensitivity analysis reduces prediction errors by only 2–6%. Halving the variance on the four (completely different) parameters with the largest simple correlation coefficients reduces prediction errors by 17–31%.
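A minimal toy sketch (not the stream model from the study) of why a Monte Carlo correlation ranking can disagree with local sensitivity analysis: the derivative alone ignores how uncertain each parameter is, while the simple correlation from Monte Carlo draws folds that uncertainty in:

```python
# Toy illustration: local sensitivity ranks parameters by the derivative
# alone, while the simple correlation from Monte Carlo draws also folds
# in each parameter's measurement uncertainty. All numbers are invented.
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
a = rng.normal(1.0, 0.01, n)   # large derivative (10), tiny uncertainty
b = rng.normal(1.0, 1.00, n)   # small derivative (0.5), large uncertainty

y = 10.0 * a + 0.5 * b         # toy 'state variable'

# Local sensitivities: dy/da = 10 >> dy/db = 0.5, so sensitivity
# analysis flags a as the parameter to measure more precisely.
r_a = abs(np.corrcoef(a, y)[0, 1])
r_b = abs(np.corrcoef(b, y)[0, 1])
print(r_b > r_a)               # True: b drives most prediction variance
```

The correlation ranking reverses the sensitivity ranking because each parameter's contribution to prediction variance is (derivative)² × (parameter variance), not the derivative alone.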

4.
Neglected biological patterns in the residuals
One of the fundamental assumptions underlying linear regression models is that the errors have a constant variance (i.e., homoscedastic). When this assumption is violated, standard errors from a regression can be biased and inconsistent, meaning that the associated p values and 95% confidence intervals cannot be trusted. The assumption of homoscedasticity is made for statistical reasons rather than biological reasons; in most real datasets, some form of heteroscedasticity is likely to exist. However, a survey of the behavioural ecology literature showed that only about 5% of articles explicitly mentioned heteroscedasticity, leaving 95% in which it was apparently never assessed. These results strongly suggest that heteroscedasticity is widely under-reported within behavioural ecology. The aim of this article is to raise awareness of heteroscedasticity amongst behavioural ecologists. Using topical examples from fields in behavioural ecology such as sexual dimorphism and animal personality, we highlight the biological importance of considering heteroscedasticity. We also emphasize that researchers should pay closer attention to the variance in their data and consider what factors could cause heteroscedasticity. In addition, we introduce some simple methods of dealing with heteroscedasticity. The two methods we focus on are: (1) incorporating variance functions within a generalised least squares (GLS) framework to model the functional form of heteroscedasticity; and (2) heteroscedasticity-consistent standard error (HCSE) estimators, which can be used when the functional form of heteroscedasticity is unknown. Using case studies, we show how both methods can influence the output from linear regression models. Finally, we hope that more researchers will consider heteroscedasticity as an important source of additional information about the particular biological process being studied, rather than an impediment to statistical analysis.
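For readers wanting to try the second approach, here is a minimal numpy sketch of White's HC0 sandwich estimator on simulated data with variance increasing in the predictor; the data and variable names are illustrative only:

```python
# Minimal sketch of a heteroscedasticity-consistent (HC0 'sandwich')
# standard error, for when the functional form of heteroscedasticity is
# unknown. Data are simulated; names are illustrative.
import numpy as np

def hc0_standard_errors(X, y):
    """OLS coefficients with White's HC0 robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (resid[:, None] ** 2 * X)   # X' diag(e_i^2) X
    cov = XtX_inv @ meat @ XtX_inv           # the 'sandwich'
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + rng.normal(0, 0.2 * x)   # error variance grows with x
beta, se = hc0_standard_errors(X, y)
print(np.round(beta, 2))                     # close to the true (2.0, 0.5)
```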

5.
Estimates of a population’s growth rate and process variance from time-series data are often used to calculate risk metrics such as the probability of quasi-extinction, but temporal correlations in the data from sampling error, intrinsic population factors, or environmental conditions can bias process variance estimators and detrimentally affect risk predictions. It has been claimed (McNamara and Harding, Ecol Lett 7:16–20, 2004) that estimates of the long-term variance that incorporate observed temporal correlations in population growth are unaffected by sampling error; however, no estimation procedures were proposed for time-series data. We develop a suite of such long-term variance estimators, and use simulated data with temporally autocorrelated population growth and sampling error to evaluate their performance. In some cases, we get nearly unbiased long-term variance estimates despite ignoring sampling error, but the utility of these estimators is questionable because of large estimation uncertainty and difficulties in estimating correlation structure in practice. Process variance estimators that ignored temporal correlations generally gave more precise estimates of the variability in population growth and of the probability of quasi-extinction. We also found that the estimation of probability of quasi-extinction was greatly improved when quasi-extinction thresholds were set relatively close to population levels. Because of precision concerns, we recommend using simple models for risk estimates despite potential biases, and limiting inference to quantifying relative risk; e.g., changes in risk over time for a single population or comparative risk among populations.
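As a point of reference, the simplest such process-variance estimator — the one that ignores both sampling error and temporal correlation, in the spirit of the simple models recommended above — is just the mean and variance of the log population growth rates (the counts below are made up):

```python
# Naive estimator of mean growth rate and process variance for a count
# time series: mean and sample variance of the log growth rates.
# This deliberately ignores sampling error and temporal correlation.
import numpy as np

counts = np.array([100, 112, 98, 120, 131, 117, 140, 155, 149, 170])
log_growth = np.diff(np.log(counts))   # r_t = log(N_{t+1} / N_t)

mu_hat = log_growth.mean()             # estimated mean log growth rate
sigma2_hat = log_growth.var(ddof=1)    # estimated process variance
print(round(mu_hat, 3), round(sigma2_hat, 3))
```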

6.
The high species diversity of some ecosystems, like tropical rainforests, goes hand in hand with the scarcity of data for most species. This hinders the development of models that require enough data for fitting. The solution commonly adopted by modellers consists of grouping species to form more sizeable data sets. Classical methods for grouping species, such as hierarchical cluster analysis, do not take account of the variability of the species characteristics used for clustering. In this study a clustering method based on aggregation theory is presented. It takes account of the variability of species characteristics by searching for the grouping that minimizes the quadratic error (squared bias plus variance) of some model’s prediction. This method allows one to check whether the variance reduction brought by data pooling compensates for the bias that it introduces. The method was applied to a data set on 94 tree species in a tropical rainforest in French Guiana, using a Usher matrix model to predict species dynamics. An optimal trade-off between bias and variance was found when grouping species. Grouping species appeared to decrease the quadratic error, except when the number of groups was very small. This clustering method yielded species groups similar to those of hierarchical cluster analysis using Ward’s method when variance was small, that is, when the number of groups was small.
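The bias-variance trade-off behind grouping can be sketched with a deliberately stylized two-species example (all numbers are invented): pooling biases each species' estimate toward the common mean but halves the estimation variance, and with small samples the quadratic error can still fall:

```python
# Stylized sketch of the quadratic-error criterion (bias^2 + variance)
# behind species grouping. Two species with close true means and small
# per-species samples: pooling introduces bias but reduces variance.
def quadratic_error(true_mean, est_mean, est_var):
    bias = est_mean - true_mean
    return bias**2 + est_var

sigma2, n = 4.0, 5              # within-species variance, sample size
mu_a, mu_b = 1.0, 1.5           # true species means (close together)

# Separate estimate for species A: unbiased, variance sigma2 / n.
qe_separate = quadratic_error(mu_a, mu_a, sigma2 / n)

# Pooled estimate: biased toward the grand mean, variance sigma2 / (2n).
pooled_mean = (mu_a + mu_b) / 2
qe_pooled = quadratic_error(mu_a, pooled_mean, sigma2 / (2 * n))

print(qe_pooled < qe_separate)  # True: pooling wins for these numbers
```

With widely separated means or larger samples the inequality flips, which is exactly the trade-off the clustering method searches over.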

7.
Eradication and control of invasive species are often possible only if populations are detected when they are small and localized. To be efficient, detection surveys should be targeted at locations where there is the greatest risk of incursions. We examine the utility of habitat suitability index (HSI) and particle dispersion models for targeting sampling for marine pests. Habitat suitability index models are a simple way to identify suitable habitat when species distribution data are lacking. We compared the performance of HSI models with statistical models derived from independent data from New Zealand on the distribution of two nonindigenous bivalves: Theora lubrica and Musculista senhousia. Logistic regression models developed using the HSI scores as predictors of the presence/absence of Theora and Musculista explained 26.7% and 6.2% of the deviance in the data, respectively. Odds ratios for the HSI scores were greater than unity, indicating that they were genuine predictors of the presence/absence of each species. The fit and predictive accuracy of each logistic model were improved when simulated patterns of dispersion from the nearest port were added as a predictor variable. Nevertheless, the combined model explained, at best, 46.5% of the deviance in the distribution of Theora and correctly predicted 56% of true presences and 50% of all cases. Omission errors were between 6% and 16%. Although statistical distribution models built directly from environmental predictors always outperformed the equivalent HSI models, the gain in model fit and accuracy was modest. High residual deviance in both types of model suggests that the distributions realized by Theora and Musculista in the field data were influenced by factors not explicitly modeled as explanatory variables and by error in the environmental data used to project suitable habitat for the species. Our results highlight the difficulty of accurately predicting the distribution of invasive marine species that exhibit low habitat occupancy and patchy distributions in time and space. Although the HSI and statistical models had utility as predictors of the likely distribution of nonindigenous marine species, the level of spatial accuracy achieved with them may be well below expectations for sensitive surveillance programs.

8.
Beach nourishment is a policy used to rebuild eroding beaches with sand dredged from other locations. Previous studies indicate that beach width positively affects coastal property values, but these studies ignore the dynamic features of beaches and the feedback that nourishment has on shoreline retreat. We correct for the resulting attenuation and endogeneity bias in a hedonic property value model by instrumenting for beach width using spatially varying coastal geological features. We find that the beach width coefficient is nearly five times larger than the OLS estimate, suggesting that beach width is a much larger portion of property value than previously thought. We use the empirical results to parameterize a dynamic optimization model of beach nourishment decisions and show that the predicted interval between nourishment projects is closer to what we observe in the data when we use the estimate from the instrumental variables model rather than OLS. As coastal communities adapt to climate change, we find that the long-term net value of coastal residential property can fall by as much as 52% when the erosion rate triples and the cost of nourishment sand quadruples.
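The instrumenting idea can be sketched with synthetic data: an endogenous regressor (standing in for beach width, correlated with the structural error) biases OLS, while an exogenous instrument (standing in for geological features) recovers the true coefficient. All values below are invented:

```python
# Two-stage least squares in the one-regressor case reduces to the
# simple IV (Wald) estimator: cov(z, y) / cov(z, x). Synthetic data;
# 'geology', 'width', and 'price' are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(7)
n = 5_000
geology = rng.normal(size=n)                          # instrument z
u = rng.normal(size=n)                                # structural error
width = 1.0 * geology + 0.8 * u + rng.normal(size=n)  # endogenous x
price = 2.0 * width + u                               # true coefficient = 2.0

ols = np.cov(width, price)[0, 1] / np.var(width)      # biased by cov(x, u)
iv = np.cov(geology, price)[0, 1] / np.cov(geology, width)[0, 1]

print(abs(iv - 2.0) < abs(ols - 2.0))                 # True: IV is less biased
```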

9.
Geostatistical models play an important role in spatial data analysis, in which model selection is inevitable. Model selection methods, such as AIC and BIC, are popular for selecting appropriate models. In recent years, model averaging methods such as smoothed AIC and smoothed BIC have also been applied to spatial data models. However, the corresponding averaging estimators are outperformed by optimal model averaging estimators (Hansen in Econometrica 75:1175–1189, 2007) for ordinary linear models. Therefore, this paper focuses on the optimal model averaging method for geostatistical models. We propose a weight choice criterion for the model averaging estimator on the basis of the generalized degrees of freedom and a data perturbation technique. We further prove theoretically that the resultant estimator is asymptotically optimal in terms of mean squared error, and numerically demonstrate its satisfactory performance. Finally, the proposed method is applied to a mercury data set.
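For contrast with the optimal-weight criterion proposed here, the smoothed-AIC scheme mentioned above is easy to state: each candidate model receives a weight proportional to exp(−ΔAIC/2). A minimal sketch with made-up AIC values:

```python
# Smoothed-AIC model averaging weights: w_i ∝ exp(-0.5 * deltaAIC_i).
# AIC values and per-model coefficients below are invented.
import numpy as np

aic = np.array([100.0, 101.2, 104.5])   # AIC of three candidate models
delta = aic - aic.min()
w = np.exp(-0.5 * delta)
w /= w.sum()                            # weights sum to 1

betas = np.array([1.10, 0.95, 1.40])    # same coefficient under each model
beta_avg = w @ betas                    # model-averaged estimate
print(np.round(w, 3), round(beta_avg, 3))
```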

10.
This paper compares procedures based on the extended quasi-likelihood, pseudo-likelihood and quasi-likelihood approaches for testing homogeneity of several proportions for over-dispersed binomial data. The type I error of the Wald tests using the model-based and robust variance estimates, the score test, and the extended quasi-likelihood ratio test (deviance reduction test) were examined by simulation. The extended quasi-likelihood method performs less well when mean responses are close to 1 or 0. The model-based Wald test based on the quasi-likelihood performs best in maintaining the nominal level. The score test performs less well when the intracluster correlations are large or heterogeneous. In summary: (i) both the quasi-likelihood and pseudo-likelihood methods appear to be acceptable, but care must be taken when overfitting a variance function with small sample sizes; (ii) the extended quasi-likelihood approach is the least favourable method because its type I error rate far exceeds the nominal level; and (iii) the robust variance estimator performs poorly, particularly when the sample size is small.
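The basic quasi-likelihood ingredient — a dispersion estimate from the Pearson chi-square statistic — can be sketched in a few lines (the counts below are invented):

```python
# Quasi-likelihood dispersion estimate for over-dispersed binomial data:
# Pearson chi-square divided by its degrees of freedom. Values > 1
# signal over-dispersion relative to the binomial model.
import numpy as np

successes = np.array([12, 3, 18, 7, 15, 2, 19, 6])
trials = np.full(8, 20)

p_hat = successes.sum() / trials.sum()     # common proportion under H0
expected = trials * p_hat
var_binom = trials * p_hat * (1 - p_hat)   # binomial variance

pearson_x2 = ((successes - expected) ** 2 / var_binom).sum()
phi = pearson_x2 / (len(successes) - 1)    # dispersion estimate
print(phi > 1)                             # True: over-dispersed here
```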

11.
Cross-correlation analysis is the most valuable and widely used statistical tool for evaluating the strength and direction of time-lagged relationships between ecological variables. Although it is well understood that temporal autocorrelation can inflate estimates of cross correlations and cause high rates of incorrectly concluding that lags exist among time series (i.e. type I error), in this study we show that a problem we term intra-multiplicity can cause substantial bias in cross-correlation analysis even in the absence of autocorrelation. Intra-multiplicity refers to the numerous time lags examined and cross-correlation coefficients computed within a pair of time series during cross-correlation analysis. We show using Monte Carlo simulations that intra-multiplicity can spuriously inflate estimates of cross correlations by identifying incorrect time lags. Further, unlike autocorrelation, which generally identifies lags close to the true lag, intra-multiplicity can erroneously identify lags anywhere in the time series and commonly results in a direction change of the correlation (i.e. positive or negative). Using Monte Carlo simulations we develop formulas that quantify the bias introduced by intra-multiplicity as a function of sample size, true cross correlation between the series, and the number of time lags examined. A priori these formulas enable researchers to determine the sample size needed to minimize the biases introduced by intra-multiplicity. A posteriori the formulas can be used to predict the expected bias and type I error rate associated with the data at hand, as well as the maximum number of time lags that can be analyzed to minimize the effects of intra-multiplicity. We examine the relationship between commercial catch of chum salmon and surface temperatures of the North Pacific (1925–1992) to illustrate the problems of intra-multiplicity in fisheries studies and the application of our formulas. 
These analyses provide a more robust framework to assess the temporal relationships between ecological variables. Received: 28 July 2000 / Accepted: 6 December 2000
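The core intra-multiplicity effect is easy to reproduce by simulation, independent of the paper's formulas: for two unrelated white-noise series, selecting the single best cross correlation across 21 candidate lags rejects the null far more often than the nominal 5%:

```python
# Monte Carlo sketch of intra-multiplicity: picking the best cross
# correlation over many candidate lags inflates the type I error rate
# even with no autocorrelation in either series.
import numpy as np

def lagged_corr(x, y, k):
    """Correlation between x_t and y_{t+k} (k may be negative)."""
    n = len(x)
    if k >= 0:
        a, b = x[:n - k], y[k:]
    else:
        a, b = x[-k:], y[:n + k]
    return np.corrcoef(a, b)[0, 1]

rng = np.random.default_rng(3)
n, reps = 30, 2_000
crit = 1.96 / np.sqrt(n)          # approximate 5% cutoff for a single lag

hits = 0
for _ in range(reps):
    x, y = rng.normal(size=n), rng.normal(size=n)
    best = max(abs(lagged_corr(x, y, k)) for k in range(-10, 11))
    hits += best > crit
rate = hits / reps
print(rate > 0.30)                # True: far above the nominal 0.05
```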

12.
Murtaugh PA. Ecology, 2007, 88(1): 56-62
I argue that ecological data analyses are often needlessly complicated, and I present two examples of published analyses for which simpler alternatives are available. Unnecessary complexity is often introduced when analysts focus on subunits of the key experimental or observational units in a study, or use a very general framework to present an analysis that is a simple special case. Simpler analyses are easier to explain and understand; they clarify what the key units in a study are; they reduce the chances for computational mistakes; and they are more likely to lead to the same conclusions when applied by different analysts to the same data.

13.
Statistical packages such as edgeR and DESeq are intended to detect genes that are relevant to phenotypic traits and diseases. A few studies have also modeled the relationships between gene expressions and traits. In the presence of multicollinearity and outliers, which are unavoidable in genetic data, the robust ridge regression estimator can be applied with the trait value as the response variable and the gene expressions as explanatory variables. In some simulation scenarios, the robust ridge estimator is resistant to outliers and less susceptible to multicollinearity than the ordinary least-squares (OLS) estimator. This study investigated the reliability of the robust ridge estimator, in a scenario where the explanatory variables have tail-dependence and negative binomial distributions, by comparing its performance to that of OLS using vine copula to model the tail-dependence among gene expressions. The robust ridge estimator and OLS were both applied to an ecological dataset. First, statistical analysis was used to compare RNA sequencing data between two treatments; then, 15 differentially expressed genes were selected. Next, the regression parameter estimates of robust ridge and OLS for the effects of the 15 contigs (explanatory variables) on trait values (response variables) were compared. Robust ridge regression was found to detect fewer positive and negative slopes than OLS regression. These results indicate that robust ridge regression can be successfully applied for RNA sequencing analysis to estimate the effect of trait-associated genes using real data, and holds great promise as a tool for modeling the association between RNA expression and phenotypic traits.

14.
Anthropogenic environmental impacts can disrupt the sensory environment of animals and affect important processes from mate choice to predator avoidance. Currently, these effects are best understood for auditory and chemosensory modalities, and recent reviews highlight their importance for conservation. We examined how anthropogenic changes to the visual environment (ambient light, transmission, and backgrounds) affect visual communication and camouflage and considered the implications of these effects for conservation. Human changes to the visual environment can increase predation risk by affecting camouflage effectiveness, lead to maladaptive patterns of mate choice, and disrupt mutualistic interactions between pollinators and plants. Implications for conservation are particularly evident for disrupted camouflage due to its tight links with survival. The conservation importance of impaired visual communication is less documented. The effects of anthropogenic changes on visual communication and camouflage may be severe when they affect critical processes such as pollination or species recognition. However, when impaired mate choice does not lead to hybridization, the conservation consequences are less clear. We suggest that the demographic effects of human impacts on visual communication and camouflage will be particularly strong when human‐induced modifications to the visual environment are evolutionarily novel (i.e., very different from natural variation); affected species and populations have low levels of intraspecific (genotypic and phenotypic) variation and behavioral, sensory, or physiological plasticity; and the processes affected are directly related to survival (camouflage), species recognition, or number of offspring produced, rather than offspring quality or attractiveness. Our findings suggest that anthropogenic effects on the visual environment may be as important for conservation as anthropogenic effects on other sensory modalities.

15.
Peres-Neto PR, Legendre P, Dray S, Borcard D. Ecology, 2006, 87(10): 2614-2625
Establishing relationships between species distributions and environmental characteristics is a major goal in the search for forces driving species distributions. Canonical ordinations such as redundancy analysis and canonical correspondence analysis are invaluable tools for modeling communities through environmental predictors. They provide the means for conducting direct explanatory analysis in which the association among species can be studied according to their common and unique relationships with the environmental variables and other sets of predictors of interest, such as spatial variables. Variation partitioning can then be used to test and determine the likelihood of these sets of predictors in explaining patterns in community structure. Although variation partitioning in canonical analysis is routinely used in ecological analysis, no effort has been reported in the literature to consider appropriate estimators so that comparisons between fractions or, eventually, between different canonical models are meaningful. In this paper, we show that variation partitioning as currently applied in canonical analysis is biased. We present appropriate unbiased estimators. In addition, we outline a statistical test to compare fractions in canonical analysis. The question addressed by the test is whether two fractions of variation are significantly different from each other. Such assessment provides an important step toward attaining an understanding of the factors patterning community structure. The test is shown to have correct type I error rates and good power for both redundancy analysis and canonical correspondence analysis.
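In the univariate-response special case, the bias correction at issue reduces to the familiar adjusted (Ezekiel) R², which shrinks the raw R² that otherwise inflates with the number of predictors; a one-line sketch with invented numbers:

```python
# Adjusted (Ezekiel) R^2: the univariate analogue of the unbiased
# fraction estimators, correcting the upward bias of raw R^2.
# n = number of sites, p = number of predictors in the fraction.
def adjusted_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

raw = 0.35
adj = adjusted_r2(raw, n=50, p=10)
print(round(adj, 3))        # shrinks well below the raw 0.35
```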

16.
Lead poisoning produces serious health problems, which are worse when a victim is younger. The US government and society have tried to prevent lead poisoning, especially since the 1970s; however, lead exposure remains prevalent. Lead poisoning analyses frequently use georeferenced blood lead level data. Like other types of data, these spatial data may contain uncertainties, such as location and attribute measurement errors, which can propagate to analysis results. For this paper, simulation experiments are employed to investigate how selected uncertainties impact regression analyses of blood lead level data in Syracuse, New York. In these simulations, location error and attribute measurement error, as well as a combination of these two errors, are embedded into the original data, and then these data are aggregated into census block group and census tract polygons. These aggregated data are analyzed with regression techniques, and comparisons are reported between the regression coefficients and their standard errors for the error added simulation results and the original results. To account for spatial autocorrelation, the eigenvector spatial filtering method and spatial autoregressive specifications are utilized with linear and generalized linear models. Our findings confirm that location error has more of an impact on the differences than does attribute measurement error, and show that the combined error leads to the greatest deviations. Location error simulation results show that smaller administrative units experience more of a location error impact, and, interestingly, coefficients and standard errors deviate more from their true values for a variable with a low level of spatial autocorrelation. These results imply that uncertainty, especially location error, has a considerable impact on the reliability of spatial analysis results for public health data, and that the level of spatial autocorrelation in a variable also has an impact on modeling results.

17.
Ecological Modelling, 2005, 186(2): 154-177
In recent years alternative modeling techniques have been used to account for spatial autocorrelations among data observations. They include linear mixed model (LMM), generalized additive model (GAM), multi-layer perceptron (MLP) neural network, radial basis function (RBF) neural network, and geographically weighted regression (GWR). Previous studies show these models are robust to the violation of model assumptions and flexible to nonlinear relationships among variables. However, many of them are non-spatial in nature. In this study, we utilize a local spatial analysis method (i.e., local Moran coefficient) to investigate spatial distribution and heterogeneity in model residuals from those modeling techniques with ordinary least-squares (OLS) as the benchmark. The regression model used in this study has tree crown area as the response variable, and tree diameter and the coordinates of tree locations as the predictor variables. The results indicate that LMM, GAM, MLP and RBF may improve model fitting to the data and provide better predictions for the response variable, but they generate spatial patterns for model residuals similar to OLS. The OLS, LMM, GAM, MLP and RBF models yield more residual clusters of similar values, indicating that trees in some sub-areas are either all underestimated or all overestimated for the response variable. In contrast, GWR estimates model coefficients at each location in the study area, and produces more accurate predictions for the response variable. Furthermore, the residuals of the GWR model have more desirable spatial distributions than the ones derived from the OLS, LMM, GAM, MLP and RBF models.

18.
An assertion deeply rooted in the ornithological literature holds that sex-specific mortality causes a sex ratio disparity (SRD) between complete and incomplete broods. Complete broods are thought to reflect the primary sex ratio before any bias introduced by developmental mortality. Contrary to this view, however, complete and incomplete broods should exhibit identical sex ratio distributions even when the sexes experience differential mortality, as shown in the classic paper of Fiala (Am Nat 115: 442–444, 1980). Therefore, in partially unsexed samples, primary sex ratio biases cannot be distinguished from biases caused by differential mortality. In addition, complete broods do not represent primary sex ratio more accurately than incomplete ones and might even be misleading. Despite Fiala’s prediction, SRD does occur in some empirical studies. We show that this pattern could arise if (1) primary sex ratio affects chick mortality rates independently of sex (direct effect), (2) primary sex ratio covaries with a variable that also affects mortality rate, or (3) sex differential mortality covaries with overall mortality rate (indirect effects). Direct effects may cause stronger SRD than indirect ones with a smaller and opposite bias in the overall sex ratio and could also lead to highly inconsistent covariate effects on brood sex ratios. These features may help differentiate direct from indirect effects. Most interestingly, differences in covariate effects between complete and incomplete broods imply that influential variables are missing from the analysis.

19.
Lajeunesse MJ. Ecology, 2011, 92(11): 2049-2055
A common effect size metric used to quantify the outcome of experiments for ecological meta-analysis is the response ratio (RR): the log proportional change in the means of a treatment and control group. Estimates of the variance of RR are also important for meta-analysis because they serve as weights when effect sizes are averaged and compared. The variance of an effect size is typically a function of sampling error; however, it can also be influenced by study design. Here, I derive new variances and covariances for RR for several often-encountered experimental designs: when the treatment and control means are correlated; when multiple treatments have a common control; when means are based on repeated measures; and when the study has a correlated factorial design, or is multivariate. These developments are useful for improving the quality of data extracted from studies for meta-analysis and help address some of the common challenges meta-analysts face when quantifying a diversity of experimental designs with the response ratio.
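For reference, the baseline sampling variance of the log response ratio for independent treatment and control groups — the case the correlated-design variants reduce to when the correlation is zero — is var(RR) = sd_t²/(n_t·m_t²) + sd_c²/(n_c·m_c²). A small worked example with invented summary statistics:

```python
# Log response ratio and its sampling variance for independent
# treatment and control groups; summary statistics are invented.
import math

def log_rr_and_var(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    rr = math.log(mean_t / mean_c)
    var = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return rr, var

rr, var = log_rr_and_var(12.0, 3.0, 10, 8.0, 2.5, 10)
weight = 1 / var       # inverse-variance weight for meta-analysis
print(round(rr, 3), round(var, 4))
```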

20.
Ecological Modelling, 2005, 181(4): 535-556
Observational models for the catch of fish at age a (or size) at time t are fundamental equations in fisheries science, linking a population model with data. The well known Baranov catch equation (which assumes that fishing and natural mortalities are constant over both age and time) is a nominal basis of those most commonly used in fish stock assessment and fish population models (which assume that fishing and natural mortalities vary with both age and time). But, what should a catch equation look like, if the instantaneous rates of fishing and natural mortalities of fish of age a at time t vary with age a and time t? Without answering this question, use of those catch equations in fish stock assessment and population models renders their results uncertain. In this paper, I derive a general catch equation (in numbers or in biomass) as an observational model of an age- and time-dependent model for a fish population by Taylor series expansion of, and by directly manipulating, a general catch integral; reduce it to commonly used catch equations; and compare the performance of 11 of them using data on the western king prawn Penaeus latisulcatus. I show that the nominal generalization of the Baranov catch equation misses several terms. In so doing, I derive the catch equations more accurately and restore these missing terms. Although almost all approximations overestimate the catch per recruit for older prawns, all commonly used catch equations and their extensions perform worse than theoretically sound representations of the general catch equation and their approximations. The age-specific bias of all models is <2.5, <18 and <90% for sampling intervals of 1, 7 and 30 days, respectively. Such large biases, even for moderate sampling intervals, highlight a need for assessing the utility of commonly used catch equations for individual species.
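For orientation, the Baranov catch equation itself (constant F and M over the interval) is a one-liner; the paper's point is that the commonly used generalizations of this baseline miss terms once F and M vary with age and time. A numeric sketch with invented rates:

```python
# Baranov catch equation: C = (F / Z) * N0 * (1 - exp(-Z * T)),
# with Z = F + M, assuming fishing mortality F and natural mortality M
# are constant over the interval of length T. Values are illustrative.
import math

def baranov_catch(N0, F, M, T=1.0):
    """Catch in numbers from N0 fish over an interval of length T."""
    Z = F + M                                 # total mortality rate
    return (F / Z) * N0 * (1 - math.exp(-Z * T))

catch = baranov_catch(N0=1_000, F=0.3, M=0.2, T=1.0)
print(round(catch, 1))
```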
