Similar literature
 20 similar articles found
1.
The spatial scan statistic is a widely applied tool for cluster detection. The spatial scan statistic evaluates the significance of a series of potential circular clusters using Monte Carlo simulation to account for the multiplicity of comparisons. In most settings, the extent of the multiplicity problem varies across the study region. For example, urban areas typically have many overlapping clusters, while rural areas have few. The spatial scan statistic does not account for these local variations in the multiplicity problem. We propose two new spatially-varying multiplicity adjustments for spatial cluster detection, one based on a nested Bonferroni adjustment and one based on local averaging. Geographic variations in power for the spatial scan statistic and the two new statistics are explored through simulation studies, and the methods are applied to both the well-known New York leukemia data and data from a case–control study of breast cancer in Wisconsin.
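As a rough illustration of the scan approach summarized in item 1, the sketch below computes a Kulldorff-style Poisson log-likelihood ratio for one candidate zone and a rank-based Monte Carlo p-value. The function names and the toy inputs are assumptions made for illustration; the spatially varying adjustments proposed in the abstract are not implemented here.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate zone.

    expected_in is the expected count inside the zone under the null,
    scaled so that expected counts over the whole map sum to total_cases.
    Only zones with an excess of cases (cases_in > expected_in) score > 0.
    """
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / (total_cases - expected_in))
    return llr


def monte_carlo_p_value(observed_max, simulated_maxima):
    """Rank-based p-value: the observed maximum is compared with the
    maxima obtained from data sets simulated under the null hypothesis."""
    exceed = sum(1 for s in simulated_maxima if s >= observed_max)
    return (1 + exceed) / (1 + len(simulated_maxima))
```

Because the p-value is based on the maximum over all candidate zones in each simulated data set, the multiplicity of overlapping zones is handled globally, which is the behaviour the abstract describes as not adapting to local conditions.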

2.
The geographic delineation of irregularly shaped spatial clusters is an ill-defined problem. Whenever the spatial scan statistic is used, some kind of penalty correction is needed to avoid excessive cluster irregularity and the consequent loss of detection power. Geometric compactness and non-connectivity regularity functions have been recently proposed as corrections. We present a novel internal cohesion regularity function based on the graph topology to penalize the presence of weak links in candidate clusters. Weak links are defined as relatively unpopulated regions within a cluster, such that their removal disconnects it. By applying this weak link cohesion function, the most geographically meaningful clusters are sifted from the immense set of possible irregularly shaped candidate cluster solutions. A multi-objective genetic algorithm (MGA) has been proposed recently to compute the Pareto sets of cluster solutions, employing Kulldorff’s spatial scan statistic and the geometric correction as objective functions. We propose novel MGAs to maximize the spatial scan, the cohesion function and the geometric function, or combinations of these functions. Numerical tests show that our proposed MGAs have high power to detect elongated clusters, and show good sensitivity and positive predictive value. The statistical significance of the clusters in the Pareto set is estimated through Monte Carlo simulations. Our method clearly distinguishes those geographically inadequate clusters which are worse from both geometric and internal cohesion viewpoints. Besides, a certain degree of irregularity of shape is allowed provided that it does not impact internal cohesion. Our method has better power of detection for clusters satisfying those requirements. We propose a more robust definition of spatial cluster using these concepts.

3.
The scan statistic is widely used in spatial cluster detection applications of inhomogeneous Poisson processes. However, real data may depart substantially from the underlying Poisson process. One such departure is an excess of zeros. Some studies point out that when applied to data with excess zeros, the spatial scan statistic may produce biased inferences. In this work, we develop a closed-form scan statistic for cluster detection of spatial zero-inflated count data. We apply our methodology to simulated and real data. Our simulations revealed that the Scan-Poisson statistic steadily deteriorates as the number of zeros increases, producing biased inferences. On the other hand, our proposed Scan-ZIP and Scan-ZIP+EM statistics are, most of the time, either superior or comparable to the Scan-Poisson statistic.
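To make the notion of zero-inflated counts in item 3 concrete, here is a minimal sketch of the standard zero-inflated Poisson (ZIP) probability mass function. It is not the Scan-ZIP statistic itself, and the names `zip_pmf`, `lam` and `p_zero` are assumptions chosen for illustration.

```python
import math


def zip_pmf(k, lam, p_zero):
    """Probability of observing count k under a zero-inflated Poisson model.

    With probability p_zero the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return p_zero + (1.0 - p_zero) * poisson
    return (1.0 - p_zero) * poisson
```

Setting `p_zero` to 0 recovers the ordinary Poisson model, which is why a plain Poisson scan can be expected to misread a surplus of zeros as a deficit of risk.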

4.
We propose a novel tool for testing hypotheses concerning the adequacy of environmentally defined factors for local clustering of diseases, through the comparative evaluation of the significance of the most likely clusters detected under maps whose neighborhood structures were modified according to those factors. A multi-objective genetic algorithm scan statistic is employed for finding spatial clusters in a map divided into a finite number of regions, whose adjacency is defined by a graph structure. This cluster finder maximizes two objectives, the spatial scan statistic and the regularity of cluster shape. Instead of specifying locations for the possible clusters a priori, as is currently done for cluster finders based on focused algorithms, we alter the usual adjacency induced by the common geographical boundary between regions. In our approach, the connectivity between regions is reinforced or weakened, according to certain environmental features of interest associated with the map. We build various plausible scenarios, each time modifying the adjacency structure on specific geographic areas in the map, and run the multi-objective genetic algorithm to select the best cluster solutions for each of the selected scenarios. The statistical significance of the most likely clusters is estimated through Monte Carlo simulations. The clusters with the lowest estimated p-values, along with their corresponding maps of enhanced environmental features, are displayed for comparative analysis. Therefore the probability of cluster detection is increased or decreased, according to changes made in the adjacency graph structure, related to the selection of environmental features. The eventual identification of the specific environmental conditions which induce the most significant clusters enables the practitioner to accept or reject different hypotheses concerning the relevance of geographical factors. Numerical simulation studies and an application for malaria clusters in Brazil are presented.

5.
Upper level set scan statistic for detecting arbitrarily shaped hotspots   (Cited by: 2; self-citations: 0; citations by others: 2)
There is a declared need for statistical science and software infrastructure for geoinformatic surveillance, covering spatial and spatiotemporal hotspot detection. A hotspot means something unusual: an anomaly, aberration, outbreak, elevated cluster, critical resource area, etc. The declared need may be for monitoring, etiology, management, or early warning. The responsible factors may be natural, accidental, or intentional. This proof-of-concept paper suggests methods and tools for hotspot detection across geographic regions and across networks. The investigation proposes development of statistical methods and tools that have immediate potential for use in critical societal areas, such as public health and disease surveillance, ecosystem health, water resources and water services, transportation networks, persistent poverty typologies and trajectories, environmental justice, biosurveillance and biosecurity, among others. We introduce, for multidisciplinary use, an innovation of the circle-based spatial and spatiotemporal scan statistic that is popular in the health area. Our innovation employs the notion of an upper level set, and is accordingly called the upper level set scan statistic, pointing to a sophisticated analytical and computational system as the next generation of the present-day popular SaTScan. Success of surveillance rests on the capability to detect potential elevated clusters. But the clusters can be of any shape, and cannot be captured only by circles. Relying on circles alone is likely to produce more false alarms and a greater false sense of security. What is needed is the capability to detect arbitrarily shaped clusters. The proposed upper level set scan statistic innovation is expected to fill this need.

6.
Routine surveillance of a large geographic region for clusters of adverse health events, particularly cancers, often involves small area health data, possibly controlling for exposure information. Many different methods have been proposed to test for the presence of geographical clusters. Two of the most popular methods are the spatial scan method proposed by Kulldorff and that using a fixed number of cases within scanning circles proposed by Besag and Newell. Although the second test is very popular, it has some difficulties. While the scan test controls for the multiple testing problem, the Besag and Newell test does not. Additionally, the latter method requires the setting of several tuning parameters whose values affect the test performance and are subjectively chosen by the user. This makes a fair comparison between the two methods difficult, and it explains why there have been few formal studies evaluating their relative performance. In this paper, we modify the Besag and Newell test to allow control of the type I error probability and compare its power with that of the spatial scan test. We used data sets from a publicly available simulated benchmark. We found that the two methods give similar results, except for clusters located in sparsely populated regions, where the spatial scan method performed better.
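For readers unfamiliar with the fixed-number-of-cases idea in item 6, the following is a minimal sketch of the classical Besag and Newell statistic for a single centre, not the modified test proposed in the abstract. The function names and the input layout (areas already ordered by distance from the centre) are assumptions made for illustration.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))
    return 1.0 - cdf


def besag_newell(cases_sorted, expected_sorted, k):
    """Classical Besag-Newell statistic for one centre.

    cases_sorted and expected_sorted list observed and expected counts for
    areas ordered by increasing distance from the centre. Returns the number
    of areas L needed to accumulate at least k cases and the p-value
    P(at least k cases in those L areas) under a Poisson null.
    Returns None if k cases are never reached.
    """
    cum_cases = 0
    cum_expected = 0.0
    for n_areas, (c, e) in enumerate(zip(cases_sorted, expected_sorted), start=1):
        cum_cases += c
        cum_expected += e
        if cum_cases >= k:
            return n_areas, poisson_upper_tail(k, cum_expected)
    return None
```

The cluster size `k` is exactly the kind of user-chosen tuning parameter the abstract refers to, and a p-value computed per centre in this way carries no correction for testing many centres.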

7.
This paper extends the spatial local-likelihood model and the spatial mixture model to the space-time (ST) domain. For comparison, a standard random effect space-time (SREST) model is examined to allow evaluation of each model’s ability in relation to cluster detection. To pursue this evaluation, we use the ST counterparts of spatial cluster detection diagnostics. The proposed criteria are based on posterior estimates (e.g., misclassification rate) and some are based on post-hoc analysis of posterior samples (e.g., exceedance probability). In addition, we examine more conventional model fit criteria including mean square error (MSE). We illustrate the methodology with a real ST dataset, Georgia throat cancer mortality data for the years 1994–2005, and a simulated dataset where different levels and shapes of clusters are embedded. Overall, it is found that conventional SREST models fare well in ST cluster detection and in goodness-of-fit, while for extreme risk detection the local likelihood ST model does best.

8.
9.
Whether general environmental exposures to endocrine disrupting chemicals (including pesticides and dioxin) might induce decreased sex ratios (male/female ratio at birth) is discussed. To address this issue, the authors looked for a space-time clustering test which could detect local areas of significantly low risk, assuming a Bernoulli distribution. If the endocrine disruptor hypothesis holds true, and if the sex ratio is a sentinel health event indicative of new reproductive hazards ascribed to environmental factors, then in a given region, either a cluster of low male/female ratio among newborn babies would be expected in the vicinity of polluting municipal solid waste incinerators (MSWIs) (supporting the dioxin hypothesis), or local clusters would be expected in some rural areas where large amounts of pesticides are sprayed. Among cluster detection tests, the spatial scan statistic has been widely used in various applications to scan for areas with high rates, and rarely (if ever) with low rates. Therefore, the goal of this paper was to check the properties of the scan statistic under a given scenario (Bernoulli distribution, search for clusters with low rates) and to assess its added value in addressing the sex ratio issue. This study took place in the Franche-Comté region (France), which is mainly rural and has three main MSWIs, only one of which had a high dioxin emission level in the past. The study population consisted of 192,490 boys and 182,588 girls born during the 1975–1999 period. On the whole, the authors conclude that: (i) spatial and space-time scan statistics provide attractive features to address the sex ratio issue; (ii) sex ratio is not markedly affected across space and does not provide a reliable screening measure for detecting reproductive hazards ascribed to environmental factors.

10.
To predict macrofaunal community composition from environmental data a two-step approach is often followed: (1) the water samples are clustered into groups on the basis of the macrofauna data and (2) the groups are related to the environmental data, e.g. by discriminant analysis. For the cluster analysis in step 1 many hard, seemingly arbitrary choices have to be made that nevertheless influence the solution (similarity measure, clustering strategy, number of clusters). The stability of the solution is often of concern, e.g. in clustering by the program. In the discriminant analysis of step 2 it can occur that a water sample is misclassified on the basis of the environmental data but on further inspection happens to be a borderline case in the cluster analysis. One would then rather reclassify such a sample and iterate the two steps. Bayesian latent class analysis is a flexible, extendable model-based cluster analysis approach that has recently gained popularity in the statistical literature and that has the potential to address these problems. It allows the macrofauna and environmental data to be modelled and analyzed in a single integrated analysis. An exciting extension is to incorporate in the analysis prior information on the habitat preferences of the macrofauna taxa such as is available in lists of indicator values. The output of the analysis is not a hard assignment of water samples to clusters but a probabilistic (fuzzy) assignment. The number of clusters is determined on the basis of the Bayes factor. A standard feature of the Bayesian method is to make predictions and to assess their uncertainty. We applied this approach to a data set consisting of 70 water samples, 484 macrofauna taxa and four environmental variables for which a five-cluster solution had previously been proposed. The standard for Bayesian estimation, the Gibbs sampler, worked fine on a subset with only 12 selected taxa but did not converge on the full set with 484 taxa. This is due to many configurations in which the assignment probabilities are all very close to either 0 or 1. This convergence problem is comparable with the local optima problem in classical cluster optimization algorithms, including the EM algorithm used in Latent Gold, a Windows program for latent class analysis. The convergence problem needs to be solved before the benefits of Bayesian latent class analysis can come to fruition in this application. We discuss possible solutions.

11.
Ecological Modelling, 2005, 185(1): 13-27
This paper describes an approach for conducting spatial uncertainty analysis of spatial population models, and illustrates the ecological consequences of spatial uncertainty for landscapes with different properties. Spatial population models typically simulate birth, death, and migration on an input map that describes habitat. Typically, only a single “reference” map is available, but we can imagine that a collection of other, slightly different, maps could be drawn to represent a particular species’ habitat. As a first approximation, our approach assumes that spatial uncertainty (i.e., the variation among values assigned to a location by such a collection of maps) is constrained by characteristics of the reference map, regardless of how the map was produced. Our approach produces lower levels of uncertainty than alternative methods used in landscape ecology because we condition our alternative landscapes on local properties of the reference map. Simulated spatial uncertainty was higher near the borders of patches. Consequently, average uncertainty was highest for reference maps with equal proportions of suitable and unsuitable habitat, and no spatial autocorrelation. We used two population viability models to evaluate the ecological consequences of spatial uncertainty for landscapes with different properties. Spatial uncertainty produced larger variation among predictions of a spatially explicit model than those of a spatially implicit model. Spatially explicit model predictions of final female population size varied most among landscapes with enough clustered habitat to allow persistence. In contrast, predictions of population growth rate varied most among landscapes with only enough clustered habitat to support a small population, i.e., near a spatially mediated extinction threshold. We conclude that spatial uncertainty has the greatest effect on persistence when the amount and arrangement of suitable habitat are such that habitat capacity is near the minimum required for persistence.

12.
Runge JP, Hines JE, Nichols JD. Ecology, 2007, 88(2): 282-288
Incorporating uncertainty in the investigation of ecological studies has been the topic of an increasing body of research. In particular, mark-recapture methodology has shown that incorporating uncertainty in the probability of detecting individuals in populations enables accurate estimation of population-level processes such as survival, reproduction, and dispersal. Recent advances in mark-recapture methodology have included estimating population-level processes for biologically important groups despite the misassignment of individuals to those groups. Examples include estimating rates of apparent survival despite less than perfect accuracy when identifying individuals to gender or breeding state. Here we introduce a method for estimating apparent survival and dispersal in species that co-occur but that are difficult to distinguish. We use data from co-occurring populations of meadow voles (Microtus pennsylvanicus) and montane voles (M. montanus) in addition to simulated data to show that ignoring species uncertainty can lead to biased estimates of population processes. The incorporation of species uncertainty in mark-recapture studies should aid future research investigating ecological concepts such as interspecific competition, niche differentiation, and spatial population dynamics in sibling species.

13.
Indoor radon is an important risk factor for human health. Indeed, radon inhalation is considered the second leading cause of lung cancer after smoking. During the last decades, many countries have made huge efforts to measure, map and predict radon levels in dwellings. Various studies have been devoted to identifying those areas within a country where high radon concentrations are more likely to be found. Data collected through indoor radon surveys have been analysed using various statistical approaches, among which hierarchical Bayesian models and geostatistical tools are worth noting. The essential goal of this paper is the identification of high radon concentration areas (the so-called radon prone areas) in the Abruzzo Region (Italy). In order to accurately pinpoint zones deserving attention for mitigation purposes, we adopt spatial cluster detection techniques, traditionally employed in epidemiology. As a first step, we assume that indoor radon measurements do not arise from a continuous spatial process; thus the geographic locations of dwellings where the radon measurements have been taken can be viewed as a realization of a spatial point process. Following this perspective, we adopt and compare recent cluster detection techniques: the simulated annealing scan statistic, the case event approach based on distance regression on the selection order and the elliptic spatial scan statistic. The analysis includes data collected during surveys carried out by the Regional Agency for the Environment Protection of Abruzzo (ARTA) in 1,861 randomly sampled dwellings across 277 municipalities of the Abruzzo region. The radon prone areas detected by the selected approaches are provided along with the summary statistics of the methods. Finally, the methodologies considered in this paper are tested on simulated data in order to evaluate their power and the precision of cluster location detection.

14.
We propose a space-time stick-breaking process for disease cluster estimation. The dependencies for spatial and temporal effects are introduced by using space-time covariate dependent kernel stick-breaking processes. We compared this model with the space-time standard random effect model by checking each model’s ability to detect clusters of various shapes and sizes. This comparison was made for simulated data where the true risks were known. For the simulated data, we observed that the space-time stick-breaking process performs better in detecting medium- and high-risk clusters. For the real data, county-specific low birth weight incidences for the state of South Carolina for the years 1997–2007, we illustrate how the proposed model can be used to find groupings of counties with higher incidence rates.

15.
Spatial concurrent linear models, in which the model coefficients are spatial processes varying at a local level, are flexible and useful tools for analyzing spatial data. One approach places stationary Gaussian process priors on the spatial processes, but in applications the data may display strong nonstationary patterns. In this article, we propose a Bayesian variable selection approach based on wavelet tools to address this problem. The proposed approach does not involve any stationarity assumptions on the priors, and instead we impose a mixture prior directly on each wavelet coefficient. We introduce an option to control the priors such that high resolution coefficients are more likely to be zero. Computationally efficient MCMC procedures are provided to address posterior sampling, and uncertainty in the estimation is assessed through posterior means and standard deviations. Examples based on simulated data demonstrate the estimation accuracy and advantages of the proposed method. We also illustrate the performance of the proposed method for real data obtained through remote sensing.

16.
Many statistical tests have been developed to assess the significance of clusters of disease located around known sources of environmental contaminants, also known as focused disease clusters. The majority of focused-cluster tests were designed to detect a particular spatial pattern of clustering, one in which the disease cluster centers around the pollution source and declines in a radial fashion with distance. However, other spatial patterns of environmentally related disease clusters are likely given that the spatial dispersion patterns of environmental contaminants, and thus human exposure, depend on a number of factors (i.e., meteorology and topography). For this study, data were simulated with five different spatial patterns of disease clusters, reflecting potential pollutant dispersion scenarios: (1) a radial effect decreasing with increasing distance, (2) a radial effect with a defined peak and decreasing with distance, (3) a simple angular effect, (4) an angular effect decreasing with increasing distance and (5) an angular effect with a defined peak and decreasing with distance. The power to detect each type of spatially distributed disease cluster was evaluated using Stone’s Maximum Likelihood Ratio Test, Tango’s Focused Test, Bithell’s Linear Risk Score Test, and variations of the Lawson–Waller Score Test. Study findings underscore the importance of considering environmental contaminant dispersion patterns, particularly directional effects, with respect to focused-cluster test selection in cluster investigations. The effect of extra variation in risk also is considered, although its effect is not substantial in terms of the power of tests.

17.
Identifying the geometrical nature of spatial point patterns plays an important role in many areas of scientific research. Common types of spatial point processes involve random, regular, and cluster patterns. However, some point patterns suggest identifiable geometrical shapes such as a circular or other conic patterns. These patterns may be recognized as either a specific clustered shape or an inhomogeneous point pattern. Less noisy conic shapes, including circular patterns, are heavily discussed in the pattern recognition literature, but the goodness-of-fit of conic-fitting algorithms is rarely discussed for very noisy data. This study addresses a parameter estimation technique for noisy circular point patterns using the maximum likelihood principle. Additionally, a spatial statistical tool known as the L-function is used to investigate whether the fitted location pattern is reasonably attributable to a circular shape. A novel quantity named ‘relative log-error’ (\(\gamma\)) is introduced to quantify the goodness-of-fit for circular model fits. An iteratively re-weighted least squares procedure is introduced and robustness is evaluated under several error structures. Computational efficiency of the current and novel circle-fitting methods is also discussed. The findings are applied to two environmental science data sets.

18.
Fire managers need to study fire history in terms of occurrence in order to understand and model the spatial distribution of the causes of ignition. Fire atlases are useful open sources of information, recording each single fire event by means of its geographical position. In such cases the fire event is treated as point-based rather than area-based data, completely losing its surface nature. Thus, an accurate method is needed to estimate continuous density surfaces from ignition points whose location is affected by a certain degree of uncertainty. Recently, the fire science community has focused its attention on the kernel density interpolation technique in order to convert point-based data into continuous surfaces. The kernel density technique requires smoothing parameters, such as the bandwidth size, to be set a priori. Up to now, the bandwidth size has often been based on subjective choices that still require expert knowledge, at best supported by empirical decisions, thus leading to serious uncertainties. In addition, a geostatistical model able to describe the point concentration and consequently the clustering degree is required. This paper tries to solve such issues by implementing the kernel density adaptive mode. The occurrence of lightning-caused and human-caused fires was investigated in the autonomous region of Aragón over 19 years (1983–2001) using 3428 and 4195 ignition points respectively for the two causes of fire origin. An analytical calibration procedure was implemented to select the most reliable density surfaces and so reduce under- and over-estimation of density, overcoming the current drawback of defining them by visual inspection or personal interpretation. Besides, ignition point location uncertainty was investigated to check the sensitivity of the proposed model. The different concentration degrees and the dissimilar spatial patterns of the two datasets allow the proposed calibration methodology to be tested under several conditions. After finding that the model is only slightly sensitive to the exact point position, the density surfaces obtained for the two causes were combined to discover hotspot areas and spatial patterns of the two causes. Evident differences in spatial location of the origin causes were noted and described. The general trend follows the geographical features and the human activity of the study areas. The proposed technique is promising for supporting decision-making in wildfire prevention actions, because the occurrence map can be used as a response variable in fire risk prediction models.
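The idea of an adaptive kernel density estimate in item 18 can be sketched as follows. This is a minimal illustration that assumes a Gaussian kernel and an Abramson-style square-root bandwidth rule; the abstract does not state which adaptive rule or calibration procedure the authors used, and all function names and parameters here are hypothetical.

```python
import math


def gaussian_kernel_2d(dist_sq, h):
    """Isotropic 2-D Gaussian kernel with bandwidth h, given squared distance."""
    return math.exp(-dist_sq / (2.0 * h * h)) / (2.0 * math.pi * h * h)


def fixed_density(points, x, y, h):
    """Fixed-bandwidth kernel density estimate at location (x, y)."""
    total = sum(gaussian_kernel_2d((x - px) ** 2 + (y - py) ** 2, h)
                for px, py in points)
    return total / len(points)


def adaptive_bandwidths(points, h0, alpha=0.5):
    """One bandwidth per point: narrower where the pilot density is high.

    The pilot density is a fixed-bandwidth estimate with bandwidth h0, and
    each local bandwidth is h0 * (pilot / geometric mean of pilot) ** -alpha.
    """
    pilot = [fixed_density(points, px, py, h0) for px, py in points]
    geo_mean = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h0 * (p / geo_mean) ** (-alpha) for p in pilot]


def adaptive_density(points, bandwidths, x, y):
    """Adaptive kernel density estimate at (x, y) using per-point bandwidths."""
    total = sum(gaussian_kernel_2d((x - px) ** 2 + (y - py) ** 2, h)
                for (px, py), h in zip(points, bandwidths))
    return total / len(points)
```

In this sketch the only parameter left to choose is the pilot bandwidth `h0`, which is the quantity a calibration step such as the one described in the abstract would need to select.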

19.
Maps are useful tools for understanding, managing, and protecting the marine environment, yet few useful and statistically defensible maps of environmental quality and aquatic resources have been developed in near-coastal regions. Current environmental management efforts, such as ocean monitoring by sewage dischargers, routinely sample areas of potential impact using sparse sampling grids. Heterogeneous oceanic conditions often make extrapolation from these grids to non-sampled locations questionable. Although rarely applied in coastal monitoring, kriging offers a more rigorous statistical approach to mapping and allows confidence intervals to be calculated for predictions. Its usefulness relies on accurate models of the spatial variability through estimating the semivariogram. Many optimal designs for estimating the semivariogram have been proposed, but these designs are often difficult to implement in practice. In this paper, we present simple design strategies for augmenting existing monitoring designs with the goal of estimating the semivariogram. In particular, we investigate a multi-lag cluster design strategy, where clusters of sites, spaced at various lag distances, are placed around fixed stations on an existing sampling grid. We find that these multi-lag cluster designs provide improved accuracy in estimating the parameters of the semivariogram. Based on simulation study findings, we apply a multi-lag cluster enhancement to the monitoring grid for the City of San Diego’s Point Loma Wastewater Treatment Plant as part of a special study to map chemical contaminants in sediments around its sewage outfall.
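Since item 19 turns on estimating the semivariogram, a minimal sketch of the classical (method-of-moments) empirical semivariogram may help. The function name, the binning scheme and the toy data are assumptions for illustration and are not taken from the paper.

```python
import math


def empirical_semivariogram(coords, values, bin_edges):
    """Classical empirical semivariogram.

    For each distance bin [bin_edges[b], bin_edges[b + 1]) this returns
    sum((z_i - z_j) ** 2) / (2 * N) over the N site pairs whose separation
    falls in the bin, or None if the bin contains no pairs.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            for b in range(n_bins):
                if bin_edges[b] <= d < bin_edges[b + 1]:
                    sums[b] += (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]
```

A sparse regular grid yields no pairs at separations shorter than the grid spacing, so the short-lag bins come back empty; adding clusters of closely spaced sites, as the abstract proposes, is what populates them.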

20.
Multiple data sources are essential to provide reliable information regarding the emergence of potential health threats, compared to single source methods. Spatial scan statistics have been adapted to analyze multivariate data sources, but only ad hoc procedures have been devised to address the problem of selecting the most likely cluster and computing its significance. In this work, information from multiple data sources of disease surveillance is incorporated to achieve more coherent spatial cluster detection using tools from multi-criteria analysis. The best cluster solutions are found by maximizing two objective functions simultaneously, based on the concept of dominance. To evaluate the statistical significance of solutions, a statistical approach based on the concept of attainment function is used. The multi-criteria approach has several advantages: the representation of the evaluation function for each data source is clear, and does not suffer from an artificial, and possibly confusing, mixture with the other data source evaluations; it is possible to attribute, in a rigorous way, the statistical significance of each candidate cluster; and it is possible to analyze and select the best cluster solutions, as given naturally by the non-dominated set. The methodology is illustrated with real datasets.
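The concept of dominance used in item 20 (and in the multi-objective genetic algorithms of items 2 and 4) can be sketched in a few lines. This is a generic illustration of extracting a non-dominated set when every objective is maximized; the function names and the toy scores are assumptions, and the attainment-function significance test is not shown.

```python
def dominates(a, b):
    """True if solution a is at least as good as b on every objective
    and strictly better on at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]


# Each tuple holds two objective values for one candidate cluster,
# for example the scan statistic computed on two different data sources.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = non_dominated(candidates)
```

Keeping the two objectives separate in this way avoids having to weight them into a single score, which is the advantage the abstract attributes to the multi-criteria approach.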
