1.

Combining functional data with hierarchical Gaussian process models

Valerie Poynor Stephan Munch 《Environmental and Ecological Statistics》2017,24(2):175-199

Gaussian process models have been used in applications ranging from machine learning to fisheries management. In the Bayesian framework, the Gaussian process is used as a prior for unknown functions, allowing the data to drive the relationship between inputs and outputs. In our research, we consider a scenario in which response and input data are available from several similar, but not necessarily identical, sources. When little information is known about one or more of the populations it may be advantageous to model all populations together. We present a hierarchical Gaussian process model with a structure that allows distinct features for each source as well as shared underlying characteristics. Key features and properties of the model are discussed and demonstrated in a number of simulation examples. The model is then applied to a data set consisting of three populations of Rotifer Brachionus calyciflorus Pallas. Specifically, we model the log growth rate of the populations using a combination of lagged population sizes. The various lag combinations are formally compared to obtain the best model inputs. We then formally compare the leading hierarchical Gaussian process model with the inferential results obtained under the independent Gaussian process model.

2.

Application of machine learning methods to palaeoecological data

Marjeta Jeraj Sašo Džeroski Ljupčo Todorovski Marko Debeljak 《Ecological modelling》2006

A palaeoecological study was conducted to investigate past environmental conditions and vegetation dynamics around the southwestern Ljubljana Moor. In order to find potential regularities and/or dependencies among co-existent plant species through time, different machine learning methods were applied to pollen records from the cores taken at Bistra and Hočevarica. The data comprised relative pollen frequencies of the most common plant genera/families at particular core depths that correspond to particular ages in the Early and Mid Holocene periods. The applied methods include equation discovery and hierarchical clustering. Both methods have found plausible and explainable relationships among identified plant genera/families.

3.

A generalized approach to modeling and estimating indirect effects in ecology

Clough Y 《Ecology》2012,93(8):1809-1815

The need to model and test hypotheses about complex ecological systems has led to a steady increase in use of path analytical techniques, which allow the modeling of multiple multivariate dependencies reflecting hypothesized causation and mechanisms. The aim is to achieve the estimation of direct, indirect, and total effects of one variable on another and to assess the adequacy of whole models. Path analytical techniques based on maximum likelihood currently used in ecology are rarely adequate for ecological data, which are often sparse, multi-level, and may contain nonlinear relationships as well as nonnormal response data such as counts or proportion data. Here I introduce a more flexible approach in the form of the joint application of hierarchical Bayes, Markov chain Monte Carlo algorithms, Shipley's d-sep test, and the potential outcomes framework to fit path models as well as to decompose and estimate effects. An example based on the direct and indirect interactions between ants, two insect herbivores, and a plant species demonstrates the implementation of these techniques, using freely available software.

4.

Accounting for matching uncertainty in two stage capture–recapture experiments using photographic measurements of natural marks

Andrea Tancredi Marie Auger-Méthé Marianne Marcoux Brunero Liseo 《Environmental and Ecological Statistics》2013,20(4):647-665

We propose a Bayesian hierarchical modeling approach for estimating the size of a closed population from data obtained by identifying individuals through photographs of natural markings. We assume that noisy measurements of a set of distinctive features are available for each individual present in a photographic catalogue. To estimate the population size from two catalogues obtained during two different sampling occasions, we embed the standard two-stage $M_t$ capture–recapture model for a closed population into a multivariate normal data matching model that identifies the common individuals across the catalogues. In addition to estimating the population size while accounting for the matching process uncertainty, this hierarchical modelling approach allows the common individuals to be identified by using the information provided by the capture–recapture model. In this way, our model also represents a novel and reliable tool able to reduce the amount of effort researchers have to expend in matching individuals. We illustrate and motivate the proposed approach via a real data set of photo-identification of narwhals. Moreover, we compare our method with a set of possible alternative approaches by using both the empirical data set and a simulation study.

5.

A condition metric for Eucalyptus woodland derived from expert evaluations

Steve J. Sinclair Matthew J. Bruce Peter Griffioen Amanda Dodd Matthew D. White 《Conservation biology》2018,32(1):195-204

The evaluation of ecosystem quality is important for land‐management and land‐use planning. Evaluation is unavoidably subjective, and robust metrics must be based on consensus and the structured use of observations. We devised a transparent and repeatable process for building and testing ecosystem metrics based on expert data. We gathered quantitative evaluation data on the quality of hypothetical grassy woodland sites from experts. We used these data to train a model (an ensemble of 30 bagged regression trees) capable of predicting the perceived quality of similar hypothetical woodlands based on a set of 13 site variables as inputs (e.g., cover of shrubs, richness of native forbs). These variables can be measured at any site and the model implemented in a spreadsheet as a metric of woodland quality. We also investigated the number of experts required to produce an opinion data set sufficient for the construction of a metric. The model produced evaluations similar to those provided by experts, as shown by assessing the model's quality scores of expert‐evaluated test sites not used to train the model. We applied the metric to 13 woodland conservation reserves and asked managers of these sites to independently evaluate their quality. To assess metric performance, we compared the model's evaluation of site quality with the managers’ evaluations through multidimensional scaling. The metric performed relatively well, plotting close to the center of the space defined by the evaluators. Given the method provides data‐driven consensus and repeatability, which no single human evaluator can provide, we suggest it is a valuable tool for evaluating ecosystem quality in real‐world contexts. We believe our approach is applicable to any ecosystem.

6.

Clustering species using a model of population dynamics and aggregation theory (cited by: 1; self-citations: 0; citations by others: 1)

Nicolas Picard Frédéric Mortier Sylvie Gourlet-Fleury 《Ecological modelling》2010,221(2):152-160

The high species diversity of some ecosystems like tropical rainforests goes hand in hand with the scarcity of data for most species. This hinders the development of models that require enough data for fitting. The solution commonly adopted by modellers consists in grouping species to form more sizeable data sets. Classical methods for grouping species such as hierarchical cluster analysis do not take account of the variability of the species characteristics used for clustering. In this study a clustering method based on aggregation theory is presented. It takes account of the variability of species characteristics by searching for the grouping that minimizes the quadratic error (square bias plus variance) of some model’s prediction. This method allows one to check whether the gain in variance brought by data pooling compensates for the bias that it introduces. This method was applied to a data set on 94 tree species in a tropical rainforest in French Guiana, using a Usher matrix model to predict species dynamics. An optimal trade-off between bias and variance was found when grouping species. Grouping species appeared to decrease the quadratic error, except when the number of groups was very small. This clustering method yielded species groups similar to those of the hierarchical cluster analysis using Ward’s method when variance was small, that is when the number of groups was small.
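The trade-off described in this abstract (quadratic error = squared bias + variance) can be illustrated with a toy calculation. The sketch below is not the paper's Usher-matrix method; it assumes two hypothetical species with true means `mu_a` and `mu_b`, a common observation variance `sigma2`, and `n` observations each, and compares estimating species A's mean from its own data with estimating it from the pooled data.

```python
def quadratic_error(bias: float, variance: float) -> float:
    """Quadratic error of an estimator: squared bias plus variance."""
    return bias ** 2 + variance


def own_vs_pooled(mu_a: float, mu_b: float, sigma2: float, n: int):
    """Quadratic error for species A's mean, unpooled versus pooled.

    Assumption (illustrative only): both species contribute n observations
    with the same variance sigma2, and the pooled estimator is the grand mean.
    """
    # Own-data sample mean: unbiased, variance sigma2 / n.
    own = quadratic_error(0.0, sigma2 / n)
    # Pooled mean targets (mu_a + mu_b) / 2, so it is biased for mu_a,
    # but it averages 2n observations, halving the variance.
    pooled = quadratic_error((mu_a + mu_b) / 2 - mu_a, sigma2 / (2 * n))
    return own, pooled


if __name__ == "__main__":
    print(own_vs_pooled(1.0, 1.2, 4.0, 5))  # similar species: pooling helps
    print(own_vs_pooled(1.0, 3.0, 4.0, 5))  # dissimilar species: pooling hurts
```

In this toy setting pooling pays off only when the squared difference in means is small relative to the variance saved, which mirrors the check the abstract describes.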

7.

Multivariate Bayesian analysis of atmosphere–ocean general circulation models

Reinhard Furrer Stephan R. Sain Douglas Nychka Gerald A. Meehl 《Environmental and Ecological Statistics》2007,14(3):249-266

Numerical experiments based on atmosphere–ocean general circulation models (AOGCMs) are one of the primary tools in deriving projections for future climate change. Although each AOGCM has the same underlying partial differential equations modeling large scale effects, they have different small scale parameterizations and different discretizations to solve the equations, resulting in different climate projections. This motivates climate projections synthesized from results of several AOGCMs’ output. We combine present day observations, present day and future climate projections in a single high-dimensional hierarchical Bayes model. The challenging aspect is the modeling of the spatial processes on the sphere, the number of parameters and the amount of data involved. We pursue a Bayesian hierarchical model that separates the spatial response into a large scale climate change signal and an isotropic process representing small scale variability among AOGCMs. Samples from the posterior distributions are obtained with computer-intensive MCMC simulations. The novelty of our approach is that we use gridded, high resolution data covering the entire sphere within a spatial hierarchical framework. The primary data source is provided by the Coupled Model Intercomparison Project (CMIP) and consists of 9 AOGCMs on a 2.8 by 2.8 degree grid under several different emission scenarios. In this article we consider mean seasonal surface temperature and precipitation as climate variables. Extensions to our model are also discussed.

8.

Making more out of sparse data: hierarchical modeling of species communities (cited by: 1; self-citations: 0; citations by others: 1)

Ovaskainen O Soininen J 《Ecology》2011,92(2):289-295

Community ecologists and conservation biologists often work with data that are too sparse for achieving reliable inference with species-specific approaches. Here we explore the idea of combining species-specific models into a single hierarchical model. The community component of the model seeks shared patterns in how the species respond to environmental covariates. We illustrate the modeling framework in the context of logistic regression and presence-absence data, but a similar hierarchical structure could also be used in many other types of applications. We first use simulated data to illustrate that the community component can improve parameterization of species-specific models especially for rare species, for which the data would be too sparse to be informative alone. We then apply the community model to real data on 500 diatom species to show that it has much greater predictive power than a collection of independent species-specific models. We use the modeling approach to show that roughly one-third of distance decay in community similarity can be explained by two variables characterizing water quality, rare species typically preferring nutrient-poor waters with high pH, and common species showing a more general pattern of resource use.

9.

Tropical deforestation in Madagascar: analysis using hierarchical, spatially explicit, Bayesian regression models

《Ecological modelling》2005,185(1):105-131

Establishing cause–effect relationships for deforestation at various scales has proven difficult even when rates of deforestation appear well documented. There is a need for better explanatory models, which also provide insight into the process of deforestation. We propose a novel hierarchical modeling specification incorporating spatial association. The hierarchical aspect allows us to accommodate misalignment between the land-use (response) data layer and explanatory data layers. Spatial structure seems appropriate due to the inherently spatial nature of land use and data layers explaining land use. Typically, there will be missing values or holes in the response data. To accommodate this we propose an imputation strategy. We apply our modeling approach to develop a novel deforestation model for the eastern wet forested zone of Madagascar, a global rain forest “hot spot”. Using five data layers created for this region, we fit a suitable spatial hierarchical model. Though fitting such models is computationally much more demanding than fitting more standard models, we show that the resulting interpretation is much richer. Also, we employ a model choice criterion to argue that our fully Bayesian model performs better than simpler ones. To the best of our knowledge, this is the first work that applies hierarchical Bayesian modeling techniques to study deforestation processes. We conclude with a discussion of our findings and an indication of the broader ecological applicability of our modeling style.

10.

Application of a Random Forest algorithm to predict spatial distribution of the potential yield of Ruditapes philippinarum in the Venice lagoon, Italy

Simone Vincenzi Matteo Zucchetta Piero Franzoi Michele Pellizzato Fabio Pranovi Giulio A. De Leo Patrizia Torricelli 《Ecological modelling》2011,222(8):1471-1478

We present a modelling framework that combines machine learning techniques and Geographic Information Systems to support the management of an important aquaculture species, Manila clam (Ruditapes philippinarum). We use the Venice lagoon (Italy), the first site in Europe for the production of R. philippinarum, to illustrate the potential of this modelling approach. To investigate the relationship between the yield of R. philippinarum and a set of environmental factors, we used a Random Forest (RF) algorithm. The RF model was tuned with a large data set (n = 1698) and validated by an independent data set (n = 841). Overall, the model provided good predictions of site-specific yields and the analysis of marginal effect of predictors showed substantial agreement among the modelled responses and available ecological knowledge for R. philippinarum. The most influential environmental factors for yield estimation were percentage of sand in the sediment, salinity, and water depth. Our results agree with findings from other North Adriatic lagoons. The application of the fitted RF model to continuous maps of all the environmental variables allowed estimates of the potential yield for the whole basin. Such a spatial representation enabled site-specific estimates of yield in different farming areas within the lagoon. We present a possible management application of our model by estimating the potential yield under the current farming distribution and comparing it to a proposed re-organization of the farming areas. Our analysis suggests a reduction of total yield is likely to result from the proposed re-organization.

11.

Hierarchical Bayesian space-time models

Christopher K. Wikle L. Mark Berliner Noel Cressie 《Environmental and Ecological Statistics》1998,5(2):117-154

Space-time data are ubiquitous in the environmental sciences. Often, as is the case with atmospheric and oceanographic processes, these data contain many different scales of spatial and temporal variability. Such data are often non-stationary in space and time and may involve many observation/prediction locations. These factors can limit the effectiveness of traditional space-time statistical models and methods. In this article, we propose the use of hierarchical space-time models to achieve more flexible models and methods for the analysis of environmental data distributed in space and time. The first stage of the hierarchical model specifies a measurement-error process for the observational data in terms of some 'state' process. The second stage allows for site-specific time series models for this state variable. This stage includes large-scale (e.g. seasonal) variability plus a space-time dynamic process for the 'anomalies'. Much of our interest is with this anomaly process. In the third stage, the parameters of these time series models, which are distributed in space, are themselves given a joint distribution with spatial dependence (Markov random fields). The Bayesian formulation is completed in the last two stages by specifying priors on parameters. We implement the model in a Markov chain Monte Carlo framework and apply it to an atmospheric data set of monthly maximum temperature.

12.

Zero-inflated models with application to spatial count data (cited by: 1; self-citations: 2; citations by others: 1)

Deepak K. Agarwal Alan E. Gelfand Steven Citron-Pousty 《Environmental and Ecological Statistics》2002,9(4):341-355

Count data arise in many contexts. Here our concern is with spatial count data which exhibit an excessive number of zeros. Using the class of zero-inflated count models provides a flexible way to address this problem. Available covariate information suggests formulation of such modeling within a regression framework. We employ zero-inflated Poisson regression models. Spatial association is introduced through suitable random effects yielding a hierarchical model. We propose fitting this model within a Bayesian framework considering issues of posterior propriety, informative prior specification and well-behaved simulation based model fitting. Finally, we illustrate the model fitting with a data set involving counts of isopod nest burrows for 1649 pixels over a portion of the Negev desert in Israel.
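For readers unfamiliar with zero-inflated Poisson models, a minimal sketch of the standard textbook mixture follows. It is an assumption that this matches the parameterization used in the paper; the names `lam` (Poisson mean) and `pi` (probability of a structural zero) are illustrative, and the regression and spatial random-effect layers described in the abstract are deliberately left out.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(Y = k) under a zero-inflated Poisson mixture.

    With probability pi the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Both parameter names are illustrative.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam: float, pi: float) -> float:
    """Expected count: only the Poisson component contributes."""
    return (1.0 - pi) * lam


if __name__ == "__main__":
    print(zip_pmf(0, 2.0, 0.3), zip_mean(2.0, 0.3))
```

In a regression setting, `lam` and `pi` would each be linked to covariates, and per the abstract the spatial association would enter through random effects.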

13.

Trait-mediated effects on flowers: artificial spiders deceive pollinators and decrease plant fitness (cited by: 1; self-citations: 0; citations by others: 1)

Gonçalves-Souza T Omena PM Souza JC Romero GQ 《Ecology》2008,89(9):2407-2413

Although predators can affect foraging behaviors of floral visitors, rarely is it known if these top-down effects of predators may cascade to plant fitness through trait-mediated interactions. In this study we manipulated artificial crab spiders on flowers of Rubus rosifolius to test the effects of predation risk on flower-visiting insects and the strength of trait-mediated indirect effects on plant fitness. In addition, we tested which predator traits (e.g., forelimbs, abdomen) are recognized and avoided by pollinators. Total visitation rate was higher for control flowers than for flowers with an artificial crab spider. In addition, flowers with a sphere (simulating a spider abdomen) were more frequently visited than those with forelimbs or the entire spider model. Furthermore, the presence of artificial spiders decreased individual seed set by 42% and fruit biomass by 50%. Our findings indicate that pollinators, mostly bees, recognize and avoid flowers with predation risk; forelimbs seem to be the predator trait recognized and avoided by hymenopterans. Additionally, predator avoidance by pollinators resulted in pollen limitation, thereby affecting some components of plant fitness (fruit biomass and seed number). Because most pollinator species that recognized predation risk visited many other plant species, trait-mediated indirect effects of spiders cascading down to plant fitness may be a common phenomenon in the Atlantic rainforest ecosystem.

14.

Metropolitan Open-Space Protection with Uncertain Site Availability (cited by: 6; self-citations: 0; citations by others: 6)

Robert G. Haight Stephanie A. Snyder Charles S. ReVelle 《Conservation biology》2005,19(2):327-337

Urban planners acquire open space to protect natural areas and provide public access to recreation opportunities. Because of limited budgets and dynamic land markets, acquisitions take place sequentially depending on available funds and sites. To address these planning features, we formulated a two-period site selection model with two objectives: maximize the expected number of species represented in protected sites and maximize the expected number of people with access to protected sites. These objectives were both maximized subject to an upper bound on area protected over two periods. The trade-off between species representation and public access was generated by the weighting method of multiobjective programming. Uncertainty was represented with a set of probabilistic scenarios of site availability in a linear-integer formulation. We used data for 27 rare species in 31 candidate sites in western Lake County, near the city of Chicago, to illustrate the model. Each trade-off curve had a concave shape in which species representation dropped at an increasing rate as public accessibility increased, with the trade-off being smaller at higher levels of the area budget. Several sites were included in optimal solutions regardless of objective function weights, and these core sites had high species richness and public access per unit area. The area protected in period one depended on current site availability and on the probabilities of sites being undeveloped and available in the second period. Although the numerical results are specific for our study, the methodology is general and applicable elsewhere.
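The weighting method mentioned in this abstract combines two objectives into one weighted sum and re-solves for different weights to trace a trade-off curve. The sketch below is a heavily simplified, single-period illustration rather than the authors' two-period stochastic integer program: it assumes additive per-site scores, uses brute-force enumeration instead of an integer-programming solver, and the site names and numbers are hypothetical.

```python
from itertools import combinations


def best_selection(sites, budget, w):
    """Pick the subset of sites maximizing w*species + (1 - w)*access.

    sites: dict mapping a site name to (area, species_score, access_score).
    budget: upper bound on total protected area.
    w: weight in [0, 1] placed on the species objective.
    Brute-force search, so only suitable for a handful of sites.
    """
    names = sorted(sites)
    best, best_value = (), float("-inf")
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            if sum(sites[s][0] for s in subset) > budget:
                continue  # violates the area budget
            value = sum(w * sites[s][1] + (1 - w) * sites[s][2] for s in subset)
            if value > best_value:
                best, best_value = subset, value
    return best, best_value


if __name__ == "__main__":
    # Hypothetical data: (area, species score, access score).
    demo = {"A": (1, 5, 1), "B": (1, 1, 5), "C": (2, 3, 3)}
    for w in (0.0, 1.0):
        print(w, best_selection(demo, budget=1, w=w))
```

Sweeping `w` over a grid between 0 and 1 and recording both objective values at each optimum gives the kind of trade-off curve the abstract refers to.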

15.

Species distribution modelling—Effect of design and sample size of pseudo-absence observations

Jogeir N. Stokland Rune Halvorsen Bente Støa 《Ecological modelling》2011,222(11):1800-1809

We explored the effect of varying pseudo-absence data in species distribution modelling using empirical data for four real species and simulated data for two imaginary species. In all analyses we used a fixed study area, a fixed set of environmental predictors and a fixed set of presence observations. Next, we added pseudo-absence data generated by different sampling designs and in different numbers to assess their relative importance for the output from the species distribution model. The sampling design strongly influenced the predictive performance of the models while the number of pseudo-absences had minimal effect on the predictive performance. We attribute much of these results to the relationship between the environmental range of the pseudo-absences (i.e. the extent of the environmental space being considered) and the environmental range of the presence observations (i.e. under which environmental conditions the species occurs). The number of generated pseudo-absences had a direct effect on the predicted probability, which translated to different distribution areas. Pseudo-absence observations that fell within grid cells with presence observations were purposely included in our analyses. We discourage the practice of excluding certain pseudo-absence data because it involves arbitrary assumptions about what are (un)suitable environments for the species being modelled.

16.

A hierarchical model for spatial capture-recapture data (cited by: 1; self-citations: 0; citations by others: 1)

Royle JA Young KV 《Ecology》2008,89(8):2281-2289

Estimating density is a fundamental objective of many animal population studies. Application of methods for estimating population size from ostensibly closed populations is widespread, but ineffective for estimating absolute density because most populations are subject to short-term movements or so-called temporary emigration. This phenomenon invalidates the resulting estimates because the effective sample area is unknown. A number of methods involving the adjustment of estimates based on heuristic considerations are in widespread use. In this paper, a hierarchical model of spatially indexed capture-recapture data is proposed for sampling based on area searches of spatial sample units subject to uniform sampling intensity. The hierarchical model contains explicit models for the distribution of individuals and their movements, in addition to an observation model that is conditional on the location of individuals during sampling. Bayesian analysis of the hierarchical model is achieved by the use of data augmentation, which allows for a straightforward implementation in the freely available software WinBUGS. We present results of a simulation study that was carried out to evaluate the operating characteristics of the Bayesian estimator under variable densities and movement patterns of individuals. An application of the model is presented for survey data on the flat-tailed horned lizard (Phrynosoma mcallii) in Arizona, USA.

17.

Modelling spatial zero-inflated continuous data with an exponentially compound Poisson process (cited by: 1; self-citations: 1; citations by others: 0)

Sophie Ancelet Marie-Pierre Etienne Hugues Benoît Eric Parent 《Environmental and Ecological Statistics》2010,17(3):347-376

A parsimonious model is presented as an alternative to delta approaches to modelling zero-inflated continuous data. The data model relies on an exponentially compound Poisson process, also called the law of leaks (LOL). It represents the process of sampling resources that are spatially distributed as Poisson distributed patches, each containing a certain quantity of biomass drawn from an exponential distribution. In an application of the LOL, two latent structures are proposed to account for spatial dependencies between zero values at different scales within a hierarchical Bayesian framework. The LOL is compared to the delta-gamma (ΔΓ) distribution using bottom-trawl survey data. Results of this case study emphasize that the LOL provides slightly better fits to learning samples with a very high proportion of zero values and small strictly positive abundance data. Additionally, it offers better predictions of validation samples.
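The data-generating process described here (a Poisson number of patches, each holding an exponentially distributed biomass) can be simulated in a few lines. This is a minimal sketch under assumed notation: `mu` for the mean number of patches per sample and `rate` for the exponential rate are illustrative names, not the paper's, and the spatial latent structures are omitted.

```python
import math
import random


def sample_poisson(mu: float, rng: random.Random) -> int:
    """Draw a Poisson(mu) count with Knuth's multiplication method (small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_lol(mu: float, rate: float, rng: random.Random) -> float:
    """One draw from an exponentially compound Poisson ('law of leaks').

    The result is exactly zero when no patch is encountered, which happens
    with probability exp(-mu); otherwise it is a sum of exponential biomasses.
    """
    n_patches = sample_poisson(mu, rng)
    return sum(rng.expovariate(rate) for _ in range(n_patches))


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_lol(0.5, 2.0, rng) for _ in range(10000)]
    zero_share = sum(1 for d in draws if d == 0) / len(draws)
    print(zero_share, sum(draws) / len(draws))  # compare with exp(-0.5) and mu / rate
```

A single process thus produces both the point mass at zero and the continuous positive part, which is why the abstract calls the model parsimonious relative to delta approaches.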

18.

Mechanistic origins of variability in phytoplankton dynamics. Part II: analysis of mesocosm blooms under climate change scenarios

Kai W. Wirtz Ulrich Sommer 《Marine Biology》2013,160(9):2503-2516

Driving factors of phytoplankton spring blooms have long been discussed, but rarely analyzed quantitatively. Here, we use a mechanistic size-based ecosystem model to reconstruct observations made during the Kiel mesocosm experiments (2005–2006). The model accurately hindcasts highly variable bloom developments including community shifts in cell size. Under low light, phytoplankton dynamics were mostly controlled by selective mesozooplankton grazing. Selective grazing also explains initial dominance of large diatoms under high light conditions. All blooms were mainly terminated by aggregation and sedimentation. Allometries in nutrient uptake capabilities led to a delayed, post-bloom dominance of small species. In general, biomass and trait dynamics revealed many mutual dependencies, while growth factors decoupled from the respective selective forces. A size shift induced by one factor often changed the growth dependency on other factors. Within climate change scenarios, these indirect effects produced large sensitivities of ecosystem fluxes to the size distribution of winter phytoplankton. These sensitivities exceeded those found for changes in vertical mixing, whereas temperature changes only had minimal impacts.

19.

Spatial prediction of soil contamination based on machine learning: a review

Yang Zhang Mei Lei Kai Li Tienan Ju 《Frontiers of Environmental Science & Engineering》2023,17(8):93

● A review of machine learning (ML) for spatial prediction of soil contamination.
● ML has achieved significant breakthroughs for soil contamination prediction.
● A structured guideline for using ML in soil contamination is proposed.
● The guideline includes variable selection, model evaluation, and interpretation.

Soil pollution levels can be quantified via sampling and experimental analysis; however, sampling is performed at discrete points separated by long distances owing to limited funding and human resources, and is insufficient to characterize the entire study area. Spatial prediction is required to comprehensively investigate potentially contaminated areas. Consequently, machine learning models that can simulate complex nonlinear relationships between a variety of environmental conditions and soil contamination have recently become popular tools for predicting soil pollution. The characteristics, advantages, and applications of machine learning models used to predict soil pollution are reviewed in this study. Satisfactory model performance generally requires the following: 1) selection of the most appropriate model with the required structure; 2) selection of appropriate independent variables related to pollutant sources and pathways to improve model interpretability; 3) improvement of model reliability through comprehensive model evaluation; and 4) integration of geostatistics with the machine learning model. With the enrichment of environmental data and development of algorithms, machine learning will become a powerful tool for predicting the spatial distribution and identifying sources of soil contamination in the future.

20.

New approaches to modelling fish-habitat relationships (cited by: 1; self-citations: 0; citations by others: 1)

Anders Knudby Alexander Brenning Ellsworth LeDrew 《Ecological modelling》2010,221(3):503-511

Ecologists often develop models that describe the relationship between faunal communities and their habitat. Coral reef fishes have been the focus of numerous such studies, which have used a wide range of statistical tools to answer an equally wide range of questions. Here, we apply a series of both conventional statistical techniques (linear and generalized additive regression models) and novel machine-learning techniques (the support vector machine and three ensemble techniques used with regression trees) to predict fish species richness, biomass, and diversity from a range of habitat variables. We compare the techniques in terms of their predictive performance, and we compare a subset of the models in terms of the influence each habitat variable has for the predictions. Prediction errors are estimated by cross-validation, and variable importance is assessed using permutations of individual variable values. For predictions of species richness and diversity the tree-based models generally and the random forest model specifically are superior (produce the lowest errors). These model types are all able to model both nonlinear and interaction effects. The linear model, unable to model either effect type, performs the worst (produces the highest errors). For predictions of biomass, the generalized additive model is superior, and the support vector machine performs the worst. Depth range, the difference between maximum and minimum water depth at a given site, is identified as the most important variable in the majority of models predicting the three fish community variables. However, variable importance is highly dependent upon model type, which leads to questions regarding the interpretation of variable importance and its proper use as an indicator of causality. The representation of ecological relationships by tree-based ensemble learners will improve predictive performance, and provide a new avenue for exploring ecological relationships, both statistical and causal. 
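The permutation approach to variable importance mentioned in this abstract can be sketched generically: shuffle one predictor at a time and record how much the prediction error rises. The code below is an illustrative assumption, not the authors' implementation; it uses mean squared error, treats the model as any callable on a row of predictors, and all function names are hypothetical.

```python
import random


def mse(model, X, y):
    """Mean squared prediction error of `model` over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each predictor column is shuffled.

    X is a list of rows (lists of numbers). A larger increase means the
    model's predictions depend more strongly on that predictor.
    """
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between predictor j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


if __name__ == "__main__":
    rng = random.Random(1)
    X = [[rng.random(), rng.random()] for _ in range(50)]
    y = [3.0 * row[0] for row in X]
    print(permutation_importance(lambda row: 3.0 * row[0], X, y))
```

Because the score measures how much a particular fitted model relies on a predictor, it can differ between model types, which is consistent with the abstract's caution about reading importance as causality.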