Similar literature
 Found 20 similar articles (search time: 377 ms)
1.
We propose a novel tool for testing hypotheses concerning the adequacy of environmentally defined factors for local clustering of diseases, through the comparative evaluation of the significance of the most likely clusters detected under maps whose neighborhood structures were modified according to those factors. A multi-objective genetic algorithm scan statistic is employed for finding spatial clusters in a map divided into a finite number of regions, whose adjacency is defined by a graph structure. This cluster finder maximizes two objectives, the spatial scan statistic and the regularity of cluster shape. Instead of specifying locations for the possible clusters a priori, as is currently done for cluster finders based on focused algorithms, we alter the usual adjacency induced by the common geographical boundary between regions. In our approach, the connectivity between regions is reinforced or weakened, according to certain environmental features of interest associated with the map. We build various plausible scenarios, each time modifying the adjacency structure on specific geographic areas in the map, and run the multi-objective genetic algorithm for selecting the best cluster solutions for each one of the selected scenarios. The statistical significances of the most likely clusters are estimated through Monte Carlo simulations. The clusters with the lowest estimated p-values, along with their corresponding maps of enhanced environmental features, are displayed for comparative analysis. Therefore, the probability of cluster detection is increased or decreased, according to changes made in the adjacency graph structure, related to the selection of environmental features. The eventual identification of the specific environmental conditions which induce the most significant clusters enables the practitioner to accept or reject different hypotheses concerning the relevance of geographical factors. 
Numerical simulation studies and an application for malaria clusters in Brazil are presented.
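
The spatial scan statistic that this kind of cluster finder maximizes can be illustrated with a minimal Python sketch. It assumes the Poisson form of Kulldorff's statistic, which the abstract does not specify; the region names, case counts, populations, candidate zones, and the helper names `poisson_scan_llr` and `zone_llr` are hypothetical. The genetic algorithm, the shape-regularity objective, and the modified adjacency graph are not shown.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate zone (Poisson model).

    Returns 0 when the zone shows no excess of cases over expectation.
    """
    c, e, n = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes in the limit where every case lies inside the zone.
    outside = (n - c) * math.log((n - c) / (n - e)) if n > c else 0.0
    return inside + outside


def zone_llr(zone, cases, population):
    """Score a zone (a set of region ids) against population-based expectations."""
    total_cases = sum(cases.values())
    total_pop = sum(population.values())
    c = sum(cases[r] for r in zone)
    e = total_cases * sum(population[r] for r in zone) / total_pop
    return poisson_scan_llr(c, e, total_cases)


# Hypothetical four-region map (illustrative numbers only).
cases = {"A": 10, "B": 12, "C": 3, "D": 5}
population = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
candidates = [{"A"}, {"A", "B"}, {"C", "D"}, {"B", "C"}]

# The "most likely cluster" is the candidate zone with the largest ratio.
best = max(candidates, key=lambda z: zone_llr(z, cases, population))
print(sorted(best), round(zone_llr(best, cases, population), 3))
```

In a scenario comparison of the kind described, only the set of admissible candidate zones would change with the adjacency structure; the scoring function would stay the same.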

2.
Air quality in an urban atmosphere is regulated by both local and distant emission sources. For air quality management in urban areas, identification of sources and their relationships with local meteorology and air pollutants is essential. The critical condition of air quality in the Indo-Gangetic plain is well known, but the lack of data on both local and distant emission sources limits the scope for improving air quality in this region. Concentrations of particulate matter smaller than 10 μm (PM10) were assessed in the highly urbanized city of Varanasi, situated in the middle Indo-Gangetic plain of India, from 2014 to 2017, to identify distant air pollution sources using trajectory statistical models and local sources using the conditional bivariate probability function. Modifying effects of meteorology and air pollutants on PM10 were also explored. Mean PM10 concentration for the study period was 244.8 ± 135.8 μg m⁻³, which was 12 times higher than the WHO annual guideline. Several distinct sources were identified in the city, with traffic as the major source of PM10. Trajectory statistical models such as cluster analysis, potential source contribution function and concentration-weighted trajectory showed significant contributions from north-west and eastern directions in the transport of polluted air masses to the city. Dew point, wind speed, temperature and ventilation coefficient are the major factors in PM10 formation and dispersion.

3.
The increasing availability of digital photographic materials has fueled efforts by agencies and organizations to generate land cover maps for states, regions, and the United States as a whole. Regardless of the information sources and classification methods used, land cover maps are subject to numerous sources of error. In order to understand the quality of the information contained in these maps, it is desirable to generate statistically valid estimates of accuracy rates describing misclassification errors. We explored a full sample survey framework for creating accuracy assessment study designs that balance statistical and operational considerations in relation to study objectives for a regional assessment of GAP land cover maps. We focused not only on appropriate sample designs and estimation approaches, but also on aspects of the data collection process, such as gaining cooperation of land owners and using pixel clusters as an observation unit. The approach was tested in a pilot study to assess the accuracy of Iowa GAP land cover maps. A stratified two-stage cluster sampling design addressed sample size requirements for land covers and the need for geographic spread while minimizing operational effort. Recruitment methods used for private land owners yielded high response rates, minimizing a source of nonresponse error. Collecting data for a 9-pixel cluster centered on the sampled pixel was simple to implement, and provided better information on rarer vegetation classes as well as substantial gains in precision relative to observing data at a single pixel.

4.
The geographic delineation of irregularly shaped spatial clusters is an ill-defined problem. Whenever the spatial scan statistic is used, some kind of penalty correction needs to be used to avoid clusters’ excessive irregularity and consequent reduction of power of detection. Geometric compactness and non-connectivity regularity functions have been recently proposed as corrections. We present a novel internal cohesion regularity function based on the graph topology to penalize the presence of weak links in candidate clusters. Weak links are defined as relatively unpopulated regions within a cluster, such that their removal disconnects it. By applying this weak link cohesion function, the most geographically meaningful clusters are sifted through the immense set of possible irregularly shaped candidate cluster solutions. A multi-objective genetic algorithm (MGA) has been proposed recently to compute the Pareto sets of cluster solutions, employing Kulldorff’s spatial scan statistic and the geometric correction as objective functions. We propose novel MGAs to maximize the spatial scan, the cohesion function and the geometric function, or combinations of these functions. Numerical tests show that our proposed MGAs have high power to detect elongated clusters, and present good sensitivity and positive predictive value. The statistical significance of the clusters in the Pareto set is estimated through Monte Carlo simulations. Our method distinguishes clearly those geographically inadequate clusters which are worse from both geometric and internal cohesion viewpoints. Besides, a certain degree of irregularity of shape is allowed provided that it does not impact internal cohesion. Our method has better power of detection for clusters satisfying those requirements. We propose a more robust definition of spatial cluster using these concepts.

5.
Identifying the major sources contributing to air pollution is a problem of fundamental importance in developing effective air quality management plans. Multivariate receptor modeling aims to achieve this goal by unfolding the air pollution data into components associated with different sources based on factor analysis models. We analyze the PM10 data obtained from 17 monitoring sites in Seoul to locate the major source regions using multivariate receptor modeling. The model uncertainty caused by the unknown number of sources and identifiability conditions is assessed by posterior probability of each model. The estimated source spatial profiles seem to be consistent with our prior expectation about the PM10 sources in Seoul.  相似文献   

6.
We introduce a new approach to diffusion-source estimation for quick identification of the unknown source, based on Taylor’s diffusion theory for turbulent transport of passive scalar from a fixed point source. In order to evaluate the method, we used planar laser-induced fluorescence to measure the concentration field of fluorescent dye in water flowing in a channel. We considered two kinds of datasets: basis data and observed data. The former is used to determine the basis functions characterizing the streamwise dependence of variances for three statistics: the mean concentration, root-mean-square (RMS) of fluctuations in the concentration, and RMS of the temporal gradient of the fluctuating concentration. Consistent with Taylor’s theory, we found that the lateral distribution of each statistic was basically Gaussian, and their standard deviations increased as a function of the square root of the distance from the emitted point. Based on these facts, a basis function can be formulated and expected to be valid for estimation of unknown sources. Source estimation was performed with the observed data, which corresponded to limited available information about the concentration from an unknown point source. We confirmed a good prediction accuracy of the proposed method with an averaged bias as small as the turbulent integral scale. Better precision was achieved by employing several statistics simultaneously. In this case, the standard deviation of the estimated source position was assessed at 14% of the mean distance between the source and measurement points, after 100 source-estimate trials with different datasets. The methodology tested in this paper is expected to be applicable to more general and complex environmental diffusion issues involving anisotropic turbulent dispersion, and space–time variable mainstream systems; but its versatility in such systems is currently under investigation.
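
The two observations reported in this abstract (a Gaussian lateral profile and a width growing with the square root of distance) suggest a simple way to see why a source distance can be backed out of a measured profile. The Python sketch below is only an illustration under those two assumptions; the coefficient `coeff`, the grid, and all function names are hypothetical, and this is not the authors' basis-function procedure.

```python
import math


def lateral_sigma(distance, coeff):
    """Plume width under the assumed square-root law: sigma = coeff * sqrt(distance)."""
    return coeff * math.sqrt(distance)


def gaussian_profile(ys, distance, coeff, peak=1.0):
    """Lateral profile of a statistic at a given distance downstream of the source."""
    s = lateral_sigma(distance, coeff)
    return [peak * math.exp(-0.5 * (y / s) ** 2) for y in ys]


def profile_sigma(ys, values):
    """Standard deviation of a lateral profile, using the values as weights."""
    total = sum(values)
    mean = sum(y * v for y, v in zip(ys, values)) / total
    var = sum(v * (y - mean) ** 2 for y, v in zip(ys, values)) / total
    return math.sqrt(var)


def estimate_distance(sigma, coeff):
    """Invert the square-root law to recover the distance back to the source."""
    return (sigma / coeff) ** 2


# Hypothetical check: synthesize a profile 4.0 length units downstream,
# then recover that distance from the measured width alone.
coeff = 0.5
ys = [i * 0.05 for i in range(-120, 121)]
observed = gaussian_profile(ys, 4.0, coeff)
estimated = estimate_distance(profile_sigma(ys, observed), coeff)
print(estimated)
```

In practice the coefficient would have to be calibrated from basis data, and combining several statistics, as the abstract reports, would be expected to tighten the estimate.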

7.
To predict macrofaunal community composition from environmental data a two-step approach is often followed: (1) the water samples are clustered into groups on the basis of the macrofauna data and (2) the groups are related to the environmental data, e.g. by discriminant analysis. For the cluster analysis in step 1 many hard, seemingly arbitrary choices have to be made that nevertheless influence the solution (similarity measure, clustering strategy, number of clusters). The stability of the solution is often of concern, e.g. in clustering by the program. In the discriminant analysis of step 2 it can occur that a water sample is misclassified on the basis of the environmental data but on further inspection happens to be a borderline case in the cluster analysis. One would then rather reclassify such a sample and iterate the two steps. Bayesian latent class analysis is a flexible, extendable model-based cluster analysis approach that recently has gained popularity in the statistical literature and that has the potential to address these problems. It allows the macrofauna and environmental data to be modelled and analyzed in a single integrated analysis. An exciting extension is to incorporate in the analysis prior information on the habitat preferences of the macrofauna taxa such as is available in lists of indicator values. The output of the analysis is not a hard assignment of water samples to clusters but a probabilistic (fuzzy) assignment. The number of clusters is determined on the basis of the Bayes factor. A standard feature of the Bayesian method is to make predictions and to assess their uncertainty. We applied this approach to a data set consisting of 70 water samples, 484 macrofauna taxa and four environmental variables for which previously a five cluster solution had been proposed. The standard for Bayesian estimation, the Gibbs sampler, worked fine on a subset with only 12 selected taxa but did not converge on the full set with 484 taxa. 
This is due to many configurations in which the assignment probabilities are all very close to either 0 or 1. This convergence problem is comparable with the local optima problem in classical cluster optimization algorithms, including the EM algorithm used in Latent Gold, a Windows program for latent class analysis. The convergence problem needs to be solved before the benefits of Bayesian latent class analysis can come to fruition in this application. We discuss possible solutions.

8.
Model-based grouping of species across environmental gradients   (Total citations: 1; self-citations: 0; citations by others: 1)
We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species’ responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species’ responses to environmental gradients. Species with similar responses are grouped with minimal information loss. We term these groups species archetypes. Each species archetype has an associated GLM that can be used to predict distributions with appropriate measures of uncertainty. Initially, we illustrate the concept and method using artificial data and then with application to real data comprising 200 species from the Great Barrier Reef (GBR) lagoon on 13 oceanographic and geological gradients from 12°S to 24°S. The 200 species from the GBR are well represented by 15 species archetypes. The model is interpreted through maps of the probability of presence for a fine scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The presence of each species archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each group ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.  相似文献   

9.
Evolutionary improvements in Geographic Information Systems (GIS) now routinely allow the management and mapping of spatial-temporal information. In response, the development of statistical models to combine information of different types and spatial support is of vital importance to environmental science. In this paper we develop a hierarchical spatial statistical model for environmental indicators of stream and river systems in the United States Mid-Atlantic Region by combining information from separate monitoring surveys, available contextual information on hydrologic units and remote sensing information. These models are used to estimate the indicators throughout the riverine system based on information from multiple sources and aggregate scales. The analysis is based on information underlying the Landscape Atlas of the mid-Atlantic region produced by the US Environmental Monitoring and Assessment Program (EMAP). We also combine information from two overlapping separate monitoring surveys, the EMAP Stream and River Survey and the Maryland Biological Streams Survey. We present a general framework for comparative distributional analysis based on the concept of a relative spatial distribution. As an application, the spatial model is used to predict spatial distributions and relative spatial distributions for a watershed.  相似文献   

10.
Many statistical tests have been developed to assess the significance of clusters of disease located around known sources of environmental contaminants, also known as focused disease clusters. The majority of focused-cluster tests were designed to detect a particular spatial pattern of clustering, one in which the disease cluster centers around the pollution source and declines in a radial fashion with distance. However, other spatial patterns of environmentally related disease clusters are likely given that the spatial dispersion patterns of environmental contaminants, and thus human exposure, depend on a number of factors (i.e., meteorology and topography). For this study, data were simulated with five different spatial patterns of disease clusters, reflecting potential pollutant dispersion scenarios: (1) a radial effect decreasing with increasing distance, (2) a radial effect with a defined peak and decreasing with distance, (3) a simple angular effect, (4) an angular effect decreasing with increasing distance and (5) an angular effect with a defined peak and decreasing with distance. The power to detect each type of spatially distributed disease cluster was evaluated using Stone’s Maximum Likelihood Ratio Test, Tango’s Focused Test, Bithell’s Linear Risk Score Test, and variations of the Lawson–Waller Score Test. Study findings underscore the importance of considering environmental contaminant dispersion patterns, particularly directional effects, with respect to focused-cluster test selection in cluster investigations. The effect of extra variation in risk also is considered, although its effect is not substantial in terms of the power of tests.  相似文献   

11.
A productive way forward in studies of animal populations is to efficiently make use of all the information available, either as raw data or as published sources, on critical parameters of interest. In this study, we demonstrate two approaches to the use of multiple sources of information on a parameter of fundamental interest to ecologists: animal density. The first approach produces estimates simultaneously from two different sources of data. The second approach was developed for situations in which initial data collection and analysis are followed up by subsequent data collection and prior knowledge is updated with new data using a stepwise process. Both approaches are used to estimate density of a rare and elusive predator, the tiger, by combining photographic and fecal DNA spatial capture-recapture data. The model, which combined information, provided the most precise estimate of density (8.5 ± 1.95 tigers/100 km² [posterior mean ± SD]) relative to a model that utilized only one data source (photographic, 12.02 ± 3.02 tigers/100 km² and fecal DNA, 6.65 ± 2.37 tigers/100 km²). Our study demonstrates that, by accounting for multiple sources of available information, estimates of animal density can be significantly improved.

12.
Information about food sources can be crucial to the success of a foraging animal. We predict that this will influence foraging decisions by group-living foragers, which may sacrifice short-term foraging efficiency to collect information more frequently. This result emerges from a model of a central-place forager that can potentially receive information on newly available superior food sources at the central place. Such foragers are expected to return early from food sources, even with just partial loads, if information about the presence of sufficiently valuable food sources is likely to become available. Returning with an incomplete load implies that the forager is at that point not achieving the maximum possible food delivery rate. However, such partial loading can be more than compensated for by an earlier exploitation of a superior food source. Our model does not assume cooperative foraging and could thus be used to investigate this effect for any social central-place forager. We illustrate the approach using numerical calculations for honeybees and leafcutter ants, which do forage cooperatively. For these examples, however, our results indicate that reducing load confers minimal benefits in terms of receiving information. Moreover, the hypothesis that foragers reduce load to give information more quickly (rather than to receive it) fits empirical data from social insects better. Thus, we can conclude that in these two cases of social-insect foraging, efficient distribution of information by successful foragers may be more important than efficient collection of information by unsuccessful ones.  相似文献   

13.
Conservation of coastal lands reduces non-point source pollution loads into oceans and estuaries, retains natural areas and saves ecological communities from disappearance and change. A recent agreement for protection of Long Island Sound waters in New York and Connecticut established 30 environmental and management goals. One of them is the establishment of a list of existing undeveloped properties and their prioritization for natural resource conservation and outdoor recreation. The optimal prioritization approach poses strong constraints and methodological challenges on the selection of data for analysis, the assignment of a priority score to each property unit and the assessment of this assignment. To be a practical tool, the prioritization model should be reproducible and include a mechanism for evaluating the prioritization scenarios obtained. The presented study uses a Geographic Information System (GIS) to assign conservation priority scores to unprotected and undeveloped parcels greater than five acres in size within New York’s Long Island Sound coastal area. The method combines spatial multi-criteria analysis and statistical methods. The results of this project include identification and prioritization of more than 700 undeveloped properties on the New York coast. The most important finding of the GIS analysis was the discovery of clusters of vacant parcels that together form large areas available for future conservation. These results offer new conservation tools and strategies to coastal managers and government in New York State.

14.
Summary (1) When a honey bee follows recruitment dances to locate a new food source, does she sample multiple dances representing different food sources and selectively respond to the strongest dance? (2) Several initial findings suggested that foragers might indeed compare dances. First, dance information is arrayed in the hive in a way that facilitates comparison-making: dances for different flower patches are performed close together in time and space. Second, food-source quality is coded in the dances, in terms of dance length (number of circuits per dance). Third, dances to natural food sources vary in length by more than 2 orders of magnitude, indicating that the quality of natural food sources varies greatly. Fourth, foragers seeking a new food source follow several dances before exiting the hive (though only one dance is followed closely). (3) Nevertheless, a critical test for comparison-making revealed that foragers evidently do not compare dances. A colony was given two feeders that were equidistant from the hive but different in profitability. If foragers do not compare dances, then the proportion of recruits arriving at the richer feeder should match the proportion of dance circuits for the richer feeder. This is the pattern that we found in all 11 trials of the experiment. (4) We suggest that the reason foragers do not compare dances is that a colony's foraging success is greater if its foragers distribute themselves among the various food sources being advertised in the hive than if they crowd themselves on the one, best source. (5) Food-source selection by honey bee colonies is a democratic decision-making process. This study reveals that this selection process is organized to function effectively even though each member of the democracy possesses incomplete information about the available choices. Offprint requests to: T.D. Seeley  相似文献   

15.
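The composition and distribution of steranes and pentacyclic triterpanes in topsoil and representative river sediments from the Tianjin area were analyzed, and the sources and environmental significance of these compounds were discussed. Steranes and pentacyclic triterpanes were detected in topsoil and river sediments from all of the environmental functional zones of Tianjin, but their concentrations differed markedly among samples. The composition of steranes and pentacyclic triterpanes in the topsoil and river sediments was essentially the same as in petroleum, and across samples the relative abundance of these compounds in the saturated hydrocarbon fraction showed a fairly good negative correlation with the carbon preference index (CPI) of the n-alkanes. This indicates that they derive mainly from petroleum and its by-products, so the ratio of steranes and triterpanes to n-alkanes in a sample can reflect the contribution of petroleum sources to the saturated hydrocarbons. Based on these ratios, together with the n-alkane CPI, the paper gives a preliminary analysis of the sources of saturated hydrocarbon pollutants in topsoil and river sediments from the different environmental functional zones of Tianjin.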

16.
● A hydrodynamic-Bayesian inference model was developed for water pollution tracking.
● Model is not stuck in local optimal solutions for high-dimensional problem.
● Model can estimate source parameters accurately with known river water levels.
● Both sudden spill incident and normal sewage inputs into the river can be tracked.
● Model is superior to the traditional approaches based on the test cases.
Water quality restoration in rivers requires identification of the locations and discharges of pollution sources, and a reliable mathematical model to accomplish this identification is essential. In this paper, an innovative framework is presented to inversely estimate pollution sources for both accident preparedness and normal management of the allowable pollutant discharge. The proposed model integrates the concepts of the hydrodynamic diffusion wave equation and an improved Bayesian-Markov chain Monte Carlo method (MCMC). The methodological framework is tested using a designed case of a sudden wastewater spill incident (i.e., source location, flow rate, and starting and ending times of the discharge) and a real case of multiple sewage inputs into a river (i.e., locations and daily flows of sewage sources). The proposed modeling based on the improved Bayesian-MCMC method can effectively solve high-dimensional search and optimization problems according to known river water levels at pre-set monitoring sites. It can adequately provide accurate source estimation parameters using only one simulation through exploration of the full parameter space. In comparison, the inverse models based on the popular random walk Metropolis (RWM) algorithm and microbial genetic algorithm (MGA) do not produce reliable estimates for the two scenarios even after multiple simulation runs, and they fall into locally optimal solutions. 
Since much more water level data are available than water quality data, the proposed approach also provides a cost-effective solution for identifying pollution sources in rivers with the support of high-frequency water level data, especially for rivers receiving significant sewage discharges.
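
For orientation, the random walk Metropolis (RWM) baseline named in this abstract can be sketched in a few lines of Python. This is a generic RWM sampler applied to a deliberately simple, hypothetical forward model (`forward`, an exponential decay with distance); it is not the hydrodynamic diffusion wave model or the improved Bayesian-MCMC method of the paper, and the site positions, noise level, prior bounds, and step sizes are all assumptions.

```python
import math
import random


def forward(source_pos, strength, sites):
    """Hypothetical toy forward model: signal decays exponentially with distance."""
    return [strength * math.exp(-abs(s - source_pos)) for s in sites]


def make_log_posterior(sites, observed, noise_sd=0.05):
    def log_posterior(theta):
        pos, strength = theta
        # Uniform prior on a bounded box; zero probability outside it.
        if not (0.0 <= pos <= 10.0 and 0.0 < strength <= 10.0):
            return -math.inf
        predicted = forward(pos, strength, sites)
        # Gaussian likelihood with known noise level (up to a constant).
        return -0.5 * sum(((o - p) / noise_sd) ** 2 for o, p in zip(observed, predicted))
    return log_posterior


def random_walk_metropolis(log_post, start, step_sd, n_iter, rng):
    current = list(start)
    current_lp = log_post(current)
    chain = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step_sd)]
        proposal_lp = log_post(proposal)
        delta = proposal_lp - current_lp
        # Accept with probability min(1, exp(delta)).
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain


sites = [2.0, 4.0, 6.0, 8.0]
observed = forward(3.0, 2.0, sites)  # noiseless synthetic data from a known source
chain = random_walk_metropolis(
    make_log_posterior(sites, observed),
    start=(3.5, 1.5),
    step_sd=(0.05, 0.05),
    n_iter=5000,
    rng=random.Random(0),
)
kept = chain[1000:]  # discard burn-in
mean_pos = sum(p for p, _ in kept) / len(kept)
mean_strength = sum(s for _, s in kept) / len(kept)
print(round(mean_pos, 2), round(mean_strength, 2))
```

The toy posterior here has a single mode near the starting point, so the sampler behaves well; with a multimodal, high-dimensional posterior the small local steps are what can leave such a chain stuck near a locally optimal solution, which is the weakness the abstract attributes to RWM.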

17.
Benstead JP, March JG, Fry B, Ewel KC, Pringle CM. Ecology, 2006, 87(2): 326-333
We sampled consumers and organic matter sources (mangrove litter, freshwater swamp-forest litter, seagrasses, seagrass epiphytes, and marine particulate organic matter [MPOM]) from four estuaries on Kosrae, Federated States of Micronesia for stable isotope (δ13C and δ34S) analysis. Unique mixing solutions cannot be calculated in a dual-isotope, five-endmember scenario, so we tested IsoSource, a recently developed statistical procedure that calculates ranges in source contributions (i.e., minimum and maximum possible). Relatively high minimum contributions indicate significant sources, while low maxima indicate otherwise. Litter from the two forest types was isotopically distinguishable but had low average minimum contributions (0-8% for mangrove litter and 0% for swamp-forest litter among estuaries). Minimum contribution of MPOM was also low, averaging 0-13% among estuaries. Instead, local marine sources dominated contributions to consumers. Minimum contributions of seagrasses averaged 8-47% among estuaries (range 0-88% among species). Minimum contributions of seagrass epiphytes averaged 5-27% among estuaries (range 0-69% among species). IsoSource enabled inclusion of five organic matter sources in our dual-isotope analysis, ranking trophic importance as follows: seagrasses > seagrass epiphytes > MPOM > mangrove forest > freshwater swamp-forest. IsoSource is thus a useful step toward understanding which of multiple organic matter sources support food webs; more detailed work is necessary to identify unique solutions.
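
The idea of reporting minimum and maximum feasible contributions, rather than a unique solution, can be sketched as an exhaustive search over source proportions. The Python below is a minimal illustration of that idea, not the IsoSource program itself: the function names, the 5% increment, the tolerance, and every isotope signature in the example are hypothetical values chosen for demonstration.

```python
def compositions(n_sources, steps):
    """Yield all tuples of non-negative integers of length n_sources summing to steps."""
    if n_sources == 1:
        yield (steps,)
        return
    for first in range(steps + 1):
        for rest in compositions(n_sources - 1, steps - first):
            yield (first,) + rest


def feasible_ranges(sources, mixture, steps=20, tolerance=0.5):
    """Return {source: (min, max)} over all proportion sets that reproduce the mixture.

    sources maps a name to a tuple of isotope signatures; mixture is the
    consumer's signature tuple. A source maps to (None, None) if nothing fits.
    """
    names = list(sources)
    lo = {name: None for name in names}
    hi = {name: None for name in names}
    for combo in compositions(len(names), steps):
        fractions = [k / steps for k in combo]
        predicted = [
            sum(f * sources[name][i] for f, name in zip(fractions, names))
            for i in range(len(mixture))
        ]
        # Keep the combination only if every isotope matches within tolerance.
        if all(abs(p - m) <= tolerance for p, m in zip(predicted, mixture)):
            for name, f in zip(names, fractions):
                lo[name] = f if lo[name] is None else min(lo[name], f)
                hi[name] = f if hi[name] is None else max(hi[name], f)
    return {name: (lo[name], hi[name]) for name in names}


# Hypothetical (carbon, sulfur) signatures for five sources and one consumer.
example_sources = {
    "mangrove_litter": (-28.0, 0.0),
    "swamp_forest_litter": (-30.0, 5.0),
    "seagrass": (-10.0, 12.0),
    "seagrass_epiphytes": (-15.0, 18.0),
    "mpom": (-21.0, 20.0),
}
for name, (low, high) in feasible_ranges(example_sources, (-16.0, 12.0)).items():
    print(name, low, high)
```

With two isotopes and five sources the system is underdetermined, so many proportion sets pass the tolerance check and each source gets a range; with three sources and two isotopes the same search collapses to a single solution.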

18.
We present a multivariate receptor model for identifying the spatial location of major PM10 pollution sources through the concentrations at multiple monitoring stations. We build on a mixed multiplicative log-normal factor model adjusting the source contributions for meteorological covariates and for temporal correlation and considering source profiles as compositional Gaussian random fields, to account for the variability induced by the spatial distribution of the monitoring sites. Taking a Bayesian approach to estimation, the proposed hierarchical model is implemented and used to analyze average daily PM10 concentration measurements from 13 monitoring sites in Taranto, Italy, for the period April–December 2005. Three major sources of pollution are identified and characterized in terms of their spatial and temporal behavior and in relation to meteorological data.  相似文献   

19.
Surrogate approaches are widely used to estimate overall taxonomic diversity for conservation planning. Surrogate taxa are frequently selected based on rarity or charisma, whereas selection through statistical modeling has been applied rarely. We used boosted‐regression‐tree models (BRT) fitted to biological data from 165 springs to identify bryophyte and invertebrate surrogates for taxonomic and functional diversity of boreal springs. We focused on these 2 groups because they are well known and abundant in most boreal springs. The best indicators of taxonomic versus functional diversity differed. The bryophyte Bryum weigelii and the chironomid larva Paratrichocladius skirwithensis best indicated taxonomic diversity, whereas the isopod Asellus aquaticus and the chironomid Macropelopia spp. were the best surrogates of functional diversity. In a scoring algorithm for priority‐site selection, taxonomic surrogates performed only slightly better than random selection for all spring‐dwelling taxa, but they were very effective in representing spring specialists, providing a distinct improvement over random solutions. However, the surrogates for taxonomic diversity represented functional diversity poorly and vice versa. When combined with cross‐taxon complementarity analyses, surrogate selection based on statistical modeling provides a promising approach for identifying groundwater‐dependent ecosystems of special conservation value, a key requirement of the EU Water Framework Directive.  相似文献   

20.
A new statistical testing approach using a weighted logrank statistic is developed for rodent tumorigenicity assays that have a single terminal sacrifice but not cause-of-death data. Instead of using cause-of-death assignment by pathologists, the number of fatal tumors is estimated by a constrained nonparametric maximum likelihood estimation method. For data lacking cause-of-death information, the Peto test is modified with estimated numbers of fatal tumors and a Fleming–Harrington-type weight, which is based on an estimated tumor survival function. A bootstrap resampling method is used to estimate the weight function. The proposed testing method with the weight adjustment appears to improve the performance in various situations of single-sacrifice animal experiments. A Monte Carlo simulation study for the proposed test is conducted to assess size and power of the test. This testing approach is illustrated using a real data set.  相似文献   
