Similar Literature
20 similar documents retrieved.
1.
Composite sampling techniques for identifying the largest individual sample value seem to be cost-effective when the composite samples are internally homogeneous. However, since it is not always possible to form homogeneous composite samples, these methods can lead to higher costs than expected. In this paper we propose a two-way composite sampling design as a way to improve the cost-effectiveness of the methods available for identifying the largest individual sample value.

2.
Compositing of individual samples is a cost-effective method for estimating a population mean, but at the expense of losing information about the individual sample values. The largest of these sample values (the hotspot) is sometimes of particular interest. Sweep-out methods attempt to identify the hotspot and its value by quantifying a (hopefully small) subset of individual values in addition to the usual quantification of the composites. Sweep-out design is concerned with the sequential selection of individual samples for quantification on the basis of all earlier quantifications (both composite and individual). The design goal is for the number of individual quantifications to be small (ideally, minimal). Previous sweep-out designs have applied to traditional (i.e., disjoint) compositing. This paper describes a sweep-out design suitable for two-way compositing: the individual samples are arranged in a rectangular array and a composite is formed from each row and also from each column. At each step, the design uses all available measurements (composite and individual) to form the best linear unbiased predictions for the currently unquantified cells. The cell corresponding to the largest predicted value is chosen next for individual measurement. The procedure terminates when the hotspot has been identified with certainty.
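The sketch below illustrates the two-way sweep-out idea on synthetic data. It is a simplified stand-in, not the authors' procedure: a plain additive row-plus-column prediction replaces the best linear unbiased predictor, the stopping rule uses nonnegativity bounds derived from the row and column composites, and the array size and distribution are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5 x 6 array of individual sample values (unknown to the analyst).
r, c = 5, 6
true = rng.lognormal(mean=0.0, sigma=1.0, size=(r, c))

# Row and column composite measurements, taken here as exact cell means.
row_comp = true.mean(axis=1)
col_comp = true.mean(axis=0)

measured = np.full((r, c), np.nan)   # individually quantified cells

def predict(i, j):
    """Additive stand-in for the BLUP of an unmeasured cell."""
    return row_comp[i] + col_comp[j] - row_comp.mean()

def upper_bound(i, j):
    """Bound on a nonnegative unmeasured cell from what its row/column composites leave unaccounted for."""
    row_rem = row_comp[i] * c - np.nansum(measured[i, :])
    col_rem = col_comp[j] * r - np.nansum(measured[:, j])
    return min(row_rem, col_rem)

n_individual = 0
while True:
    unmeasured = [(i, j) for i in range(r) for j in range(c) if np.isnan(measured[i, j])]
    best_seen = np.nanmax(measured) if n_individual else -np.inf
    # The hotspot is known with certainty once nothing unmeasured can beat the best measured value.
    if all(upper_bound(i, j) <= best_seen for i, j in unmeasured):
        break
    i, j = max(unmeasured, key=lambda ij: predict(*ij))   # quantify the largest predicted cell
    measured[i, j] = true[i, j]
    n_individual += 1

hot = np.unravel_index(np.nanargmax(measured), measured.shape)
print(f"hotspot at cell {hot}, value {measured[hot]:.3f}, "
      f"found with {n_individual} of {r * c} individual measurements")
```

Because an unmeasured cell of a nonnegative array can never exceed what its row or column composite leaves unaccounted for, the loop can stop as soon as the best value already quantified dominates every remaining bound.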

3.
Quantifying a composite sample results in a loss of information on the values of the constituent individual samples. As a consequence of this information loss, it is impossible to identify individual samples having large values on the basis of composite sample measurements alone. However, under certain circumstances, it is possible to identify individual samples having large values without exhaustively measuring all individual samples: in addition to the composite sample measurements, a few additional measurements on carefully selected individual samples suffice. In this paper, we present a statistical method to recover extremely large individual sample values using composite sample measurements. An application to site characterization is used to illustrate the method. The paper has been prepared with partial support from the United States Environmental Protection Agency under Cooperative Agreement Number CR815273. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency, and no official endorsement should be inferred.

4.
When an environmental sampling objective is to classify all the sample units as contaminated or not, composite sampling with selective retesting can substantially reduce costs by reducing the number of units that require direct analysis. The tradeoff, however, is increased complexity that has its own hidden costs. For this reason, we propose a model for assessing the relative cost, expressed as the ratio of total expected cost with compositing to total expected cost without compositing (initial exhaustive testing). Expressions are derived for the following retesting protocols: (i) exhaustive, (ii) sequential and (iii) binary split. The effects of both false positive and false negative rates are also derived and incorporated. The derived expressions of relative cost are illustrated for a range of values of the various cost components that reflect typical costs incurred in hazardous waste site monitoring. The results allow those who design sampling plans to evaluate whether any of these compositing/retesting protocols will be cost-effective for particular applications.
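As a rough illustration of the relative-cost idea (not the paper's derived expressions), the sketch below computes the expected number of analyses per unit under compositing with exhaustive retesting of flagged composites, relative to one analysis per unit under initial exhaustive testing; the prevalence, sensitivity and false-positive rate are assumed values, per-analysis costs are taken as equal, and individual retests are treated as error-free.

```python
def composite_relative_cost(k, p, sens=0.99, fpr=0.01):
    """Expected analyses per unit when k samples are composited and every member of a
    flagged composite is retested (exhaustive retesting), relative to one analysis per
    unit under initial exhaustive testing.

    k    -- composite size; p -- probability an individual unit is contaminated
    sens -- probability a truly contaminated composite is flagged
    fpr  -- probability a clean composite is flagged
    Assumes equal cost per analysis and error-free individual retests.
    """
    q = (1.0 - p) ** k                    # composite contains no contaminated unit
    p_flag = sens * (1.0 - q) + fpr * q   # composite is flagged for retesting
    return 1.0 / k + p_flag               # composite analysis share + expected retests per unit

for p in (0.01, 0.05, 0.20):
    costs = {k: round(composite_relative_cost(k, p), 3) for k in (2, 4, 8, 16)}
    print(f"prevalence {p:.2f}: relative cost by composite size -> {costs}")
```

Values below 1 mean compositing is expected to need fewer analyses than testing every unit; as prevalence grows the advantage disappears, which is the tradeoff the paper's fuller model quantifies with protocol-specific cost components.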

5.
This monograph on composite sampling, co-authored by Patil, Gore, and Taillie, provides, for the first time, a comprehensive statistical account of composite sampling as an ingenious environmental sampling method for achieving observational economy in a variety of environmental and ecological studies. Sampling consists of selection, acquisition, and quantification of a part of the population. But often what is desirable is not affordable, and what is affordable is not adequate. How do we deal with this dilemma? Operationally, composite sampling recognizes the distinction between selection, acquisition, and quantification. In certain applications, it is a common experience that the costs of selection and acquisition are not very high, but the cost of quantification, or measurement, is substantially higher. In such situations, one may select a sample sufficiently large to satisfy the requirements of representativeness and precision and then, by combining several sampling units into composites, reduce the cost of measurement to an affordable level. Thus composite sampling offers an approach to the classical dilemma of desirable versus affordable sample sizes when conventional statistical methods fail to resolve the problem. Composite sampling, at least under idealized conditions, incurs no loss of information for estimating population means. An important limitation of the method, however, has been the loss of information on individual sample values, such as the extremely large value. In many situations where individual sample values are of interest or concern, composite sampling methods can be suitably modified to retrieve the information on individual sample values that would otherwise be lost to compositing. In this monograph, we present statistical solutions to these and other issues that arise in the context of applications of composite sampling. The monograph appears as vol. 4 of the Springer series Environmental and Ecological Statistics <http://www.springer.com/series/7506>. The authors are Ganapati P. Patil, Sharad D. Gore, and Charles Taillie; 1st edition, 2011, XIII + 275 pp., 47 illustrations, hardcover, ISBN 978-1-4419-7627-7; available via SpringerLink <http://www.springerlink.com/content/978-1-4419-7627-7>.

6.
The high costs of laboratory analytical procedures frequently strain environmental and public health budgets. Whether soil, water or biological tissue is being analysed, the cost of testing for chemical and pathogenic contaminants can be quite prohibitive. Composite sampling can substantially reduce analytical costs because the number of required analyses is reduced by compositing several samples into one and analysing the composited sample. By appropriate selection of the composite sample size and retesting of selected individual samples, composite sampling may reveal the same information as would otherwise require many more analyses. Many of the limitations of composite sampling have been overcome by recent research, opening up more widespread potential for using composite sampling to reduce the costs of environmental and public health assessments while maintaining, and often increasing, the precision of sample-based inference.

7.
Cleanup standards at hazardous waste sites include (i) numeric standards (often risk-based), (ii) background standards, in which the remediated site is compared with data from a supposedly clean region, and (iii) interim standards, in which the remediated site is compared with preremediation data from the same site. The latter are especially appropriate for verifying progress when an innovative, but unproven, technology is used for remediation. Standards of type (i) require one-sample statistical tests, while those of types (ii) and (iii) call for two-sample tests. This paper considers two-sample tests with an emphasis on the type (iii) scenario. Both parametric (likelihood ratio) and nonparametric (linear rank) protocols are examined. The methods are illustrated with preremediation data from a site on the National Priorities List. The results indicate that nonparametric procedures can be quite competitive (in terms of power) with distributional modelling, provided a near-optimal rank test is selected. Suggestions are given for identifying such rank tests. The results also confirm the importance of sound baseline sampling; no amount of post-remediation sampling can overcome baseline deficiencies. This paper has been prepared with partial support from the United States Environmental Protection Agency under Cooperative Agreement Number CR-815273. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency, and no official endorsement should be inferred.
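For a concrete, entirely synthetic flavour of the two-sample comparison, the sketch below runs a Wilcoxon rank-sum test of post- versus pre-remediation concentrations and checks its Monte Carlo power against a t-test on log concentrations under a lognormal shift; the sample sizes, shift and variability are assumed, and this is not the rank-test selection procedure developed in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def power(n_pre=30, n_post=30, median_reduction=0.5, sigma=1.0, alpha=0.05, n_rep=2000):
    """Monte Carlo power of a Wilcoxon rank-sum test (post < pre) and of Welch's t-test
    on log concentrations, for lognormal data whose median drops by `median_reduction`."""
    rej_rank = rej_t = 0
    for _ in range(n_rep):
        pre = rng.lognormal(mean=0.0, sigma=sigma, size=n_pre)
        post = rng.lognormal(mean=np.log(1.0 - median_reduction), sigma=sigma, size=n_post)
        if stats.mannwhitneyu(post, pre, alternative="less").pvalue < alpha:
            rej_rank += 1
        if stats.ttest_ind(np.log(post), np.log(pre), equal_var=False,
                           alternative="less").pvalue < alpha:
            rej_t += 1
    return rej_rank / n_rep, rej_t / n_rep

rank_power, t_power = power()
print(f"Wilcoxon rank-sum power: {rank_power:.2f}   Welch t-test (log scale) power: {t_power:.2f}")
```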

8.
The objective of a long-term soil survey is to determine the mean concentrations of several chemical parameters for pre-defined soil layers and to compare them with the corresponding values in the past. A two-stage random sampling procedure is used to achieve this goal. In the first stage, n subplots are selected from N subplots by simple random sampling without replacement; in the second stage, m sampling sites are chosen within each of the n selected subplots. Thus n · m soil samples are collected for each soil layer. The idea of the composite sample design comes from the challenge of reducing very expensive laboratory analyses: the m laboratory samples from one subplot and one soil layer are physically mixed to form a composite sample. From each of the n selected subplots, one composite sample per soil layer is analyzed in the laboratory, thus n per soil layer in total. In this paper we show that the cost is reduced by the factor m − 1 when the composite sample alternative is used instead of two-stage sampling; however, the variance of the composite sample mean is increased. In the case of positive intraclass correlation the increase is less than 12.5%; in the case of negative intraclass correlation the increase also depends on the properties of the variable. For the univariate case we derive the optimal number of subplots and sampling sites. A case study is discussed at the end.
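The simulation below illustrates the trade-off under an assumed variance-components model with a per-analysis laboratory error: mixing the m samples from a subplot cuts the number of analyses from n·m to n, but the laboratory error is then averaged over fewer analyses, so the variance of the estimated mean rises. All variance components and sizes are hypothetical, and the figures will not reproduce the paper's 12.5% bound.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate(n=10, m=5, sd_subplot=1.0, sd_within=1.0, sd_lab=0.5, n_rep=5000):
    """Compare the estimated mean from full two-stage sampling (n*m lab analyses)
    with its composite alternative (n lab analyses, one physical mix per subplot)."""
    two_stage, composite = [], []
    for _ in range(n_rep):
        subplot_eff = rng.normal(0.0, sd_subplot, size=n)
        values = subplot_eff[:, None] + rng.normal(0.0, sd_within, size=(n, m))
        # Two-stage: every sample analysed separately, each with its own lab error.
        measured = values + rng.normal(0.0, sd_lab, size=(n, m))
        two_stage.append(measured.mean())
        # Composite: the m samples in a subplot are mixed, so only n analyses are run.
        comp_measured = values.mean(axis=1) + rng.normal(0.0, sd_lab, size=n)
        composite.append(comp_measured.mean())
    return np.var(two_stage), np.var(composite)

v_two_stage, v_composite = simulate()
print(f"var(two-stage mean) = {v_two_stage:.4f}")
print(f"var(composite mean) = {v_composite:.4f}  "
      f"(relative increase {(v_composite / v_two_stage - 1) * 100:.1f}%, with m times fewer analyses)")
```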

9.
The initial use of composite sampling involved the analysis of many negative samples with relatively high laboratory cost (Dorfman sampling). We propose a method of double compositing and compare its efficiency with Dorfman sampling. The variability of composite sample measurements is of environmental interest (hot spots). The precision of these estimates depends on the kurtosis of the distribution; leptokurtic distributions (γ2 > 0) have increased precision as the number of field samples is increased, while the opposite effect is obtained for platykurtic distributions. In the lognormal case, coverage probabilities are reasonable for σ < 0.5. The Poisson distribution can be associated with temporal compositing, of particular interest where radioactive measurements are taken. Sample size considerations indicate that the total sampling effort is directly proportional to the length of time sampled. If there is background radiation, then increasing levels of this radiation require larger sample sizes to detect the same difference in radiation.

10.
Suppose fish are to be sampled from a stream. A fisheries biologist might ask one of the following three questions: ‘How many fish do I need to catch in order to see all of the species?’, ‘How many fish do I need to catch in order to see all species whose relative frequency is more than 5%?’, or ‘How many fish do I need to catch in order to see a member from each of the species A, B, and C?’. This paper offers a practical solution to such questions by setting a target sample size designed to achieve the desired result with known probability. We present three sample size methods, one we call ‘exact’ and the others approximate. Each method is derived under assumed multinomial sampling, and requires (at least approximate) independence of draws and (usually) a large population. The minimum information needed to compute one of the approximate methods is the estimated relative frequency of the rarest species of interest. The total number of species is not needed. Choice of a sample size method depends largely on available computer resources. One approximation (called the ‘Monte Carlo approximation’) gets within ±6 units of the exact sample size, but usually requires 20–30 minutes of computer time to compute. The second approximation (called the ‘ratio approximation’) can be computed manually and has relative error under 5% when all species are desired, but can be as much as 50% or more too high when the exact sample size is small. Statistically, this problem is an application of the ‘sequential occupancy problem’. Three examples are given which illustrate the calculations so that a reader not interested in technical details can apply our results.
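The sketch below mimics the spirit of the Monte Carlo approximation (not the paper's implementation): it simulates multinomial catches, records how many draws are needed before every species of interest has been seen, and reports the sample size that achieves the desired probability. The species frequencies, target species and probability level are all assumed.

```python
import numpy as np

rng = np.random.default_rng(2)

def required_catch(freqs, target_species, prob=0.95, n_rep=2000, max_n=2000):
    """Smallest catch size n such that every species in target_species is seen at least
    once with probability >= prob, under multinomial sampling with the given relative
    frequencies (a simulation-based take on the sequential occupancy problem)."""
    freqs = np.asarray(freqs, dtype=float)
    waits = np.empty(n_rep)
    for rep in range(n_rep):
        seq = rng.choice(freqs.size, size=max_n, p=freqs)            # one simulated catch sequence
        firsts = []
        for s in target_species:
            hits = np.nonzero(seq == s)[0]
            firsts.append(hits[0] + 1 if hits.size else max_n)       # draw at which species s first appears
        waits[rep] = max(firsts)                                      # all target species seen by this draw
    return int(np.ceil(np.quantile(waits, prob)))

# Hypothetical stream with six species; we want to see species 3, 4 and 5 (the rarer ones).
freqs = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]
print("required catch size:", required_catch(freqs, target_species=[3, 4, 5]))
```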

11.
Determining the optimum number of increments in composite sampling
Composite sampling can be more cost-effective than simple random sampling. This paper considers how to determine the optimum number of increments to use in composite sampling. Composite sampling terminology and theory are outlined, and a method is developed which accounts for the different sources of variation in compositing and data analysis. This method is used to define and understand the process of determining the optimum number of increments that should be used in forming a composite. The blending variance is shown to have a smaller range of possible values than previously reported when estimating the number of increments in a composite sample. Accounting for differing levels of the blending variance significantly affects the estimated number of increments.
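The paper's variance decomposition is not reproduced here, but the following minimal sketch shows the flavour of the calculation under an assumed model in which a composite measurement has variance sigma_s**2 / n + sigma_b**2 + sigma_a**2 (sampling, blending and analytical components): one solves for the smallest number of increments n that meets a variance target. All symbols and values are hypothetical.

```python
import math

def optimum_increments(sigma_s, sigma_b, sigma_a, target_var):
    """Smallest number of increments n for which the assumed composite measurement
    variance sigma_s**2 / n + sigma_b**2 + sigma_a**2 meets the target variance."""
    floor_var = sigma_b ** 2 + sigma_a ** 2   # variance that remains however many increments are used
    if target_var <= floor_var:
        raise ValueError("blending plus analytical variance already exceed the target")
    return max(1, math.ceil(sigma_s ** 2 / (target_var - floor_var)))

# Hypothetical variance components (squared concentration units) and target.
print(optimum_increments(sigma_s=2.0, sigma_b=0.3, sigma_a=0.4, target_var=0.5))
```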

12.
Ranked set sampling: an annotated bibliography
The paper provides an up-to-date annotated bibliography of the literature on ranked set sampling. The bibliography includes all pertinent papers known to the authors, and is intended to cover applications as well as theoretical developments. The annotations are arranged in chronological order and are intended to be sufficiently complete and detailed that a reading from beginning to end would provide a statistically mature reader with a state-of-the-art survey of ranked set sampling, including its historical development, current status, and future research directions and applications. A final section of the paper gives a listing of all annotated papers, arranged in alphabetical order by author. This paper was prepared with partial support from the United States Environmental Protection Agency under Cooperative Agreement Number CR-821531. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency, and no official endorsement should be inferred.

13.
Resampling from stochastic simulations
To model the uncertainty of an estimate of a global property, the estimation process is repeated on multiple simulated fields, with the same sampling strategy and estimation algorithm. As opposed to the conventional bootstrap, this resampling scheme allows for spatially correlated data and for the common situation of preferential and biased sampling. The practice of this technique is developed on a large data set for which the reference sampling distributions are available. Comparison of the resampled distributions with that reference shows the probability intervals obtained by resampling to be reasonably accurate and conservative, provided the original and actual sample has been corrected for the major biases induced by preferential sampling. Andre G. Journel is a Professor of Petroleum Engineering at Stanford University with a joint appointment in the Department of Geological and Environmental Sciences. He is also Director of the Stanford Center for Reservoir Forecasting. Professor Journel has pioneered applications of geostatistical techniques in the mining/petroleum industry and extended his expertise to environmental applications and repository site characterization. Most notably, he developed the concept of non-parametric geostatistics and stochastic imaging with application to modeling uncertainty in reservoir/site characterization. Although the research described in this article has been supported by the United States Environmental Protection Agency under Cooperative Agreement CR819407, it has not been subjected to Agency review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.
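A minimal sketch of the idea, under stated stand-in assumptions: Gaussian fields with an exponential covariance on a small grid play the role of the stochastic simulations, a fixed sampling design and a simple estimation algorithm (the sample mean) are applied identically to every field, and the spread of the resulting estimates is read as a probability interval. The preferential-sampling bias correction discussed in the abstract is not modelled here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Grid and exponential covariance for the simulated fields (stand-in model).
nx = ny = 20
xs, ys = np.meshgrid(np.arange(nx), np.arange(ny))
coords = np.column_stack([xs.ravel(), ys.ravel()])
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
cov = np.exp(-dists / 5.0)
chol = np.linalg.cholesky(cov + 1e-8 * np.eye(coords.shape[0]))

# Fixed sampling design reused on every simulated field (here: 40 fixed cells).
design = rng.choice(coords.shape[0], size=40, replace=False)

def estimate(field, design):
    """Estimation algorithm applied identically to every field: the sample mean."""
    return field[design].mean()

estimates = []
for _ in range(200):                      # 200 simulated fields
    field = chol @ rng.standard_normal(coords.shape[0])
    estimates.append(estimate(field, design))

lo, hi = np.quantile(estimates, [0.05, 0.95])
print(f"90% probability interval for the global mean estimate: [{lo:.3f}, {hi:.3f}]")
```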

14.
The choice of neighborhood definition and critical value in adaptive cluster sampling is critical for designing an efficient survey. In designing an efficient adaptive cluster sample one should aim for a small difference between the initial and final sample sizes, and a small difference between the within-network and population variances. However, the two aims can be at odds with each other, because a small difference between initial and final sample sizes usually means a small within-network variance. One way to help in designing an efficient survey is to think in terms of small network sizes, since the network size is a function of both the critical value and the neighborhood definition. One should aim for networks that are small enough to ensure the final sample size is not excessively large compared with the initial sample size, but large enough to ensure the within-network variance is a reasonable fraction of the population variance. In this study, surveys whose networks were two to four units in size were the most efficient.

15.
Ranked set sampling can provide an efficient basis for estimating parameters of environmental variables, particularly when sampling costs are intrinsically high. Various ranked set estimators of the population mean are considered and contrasted in terms of their efficiencies and usefulness, with special concern for sample design considerations. Specifically, we consider the effects of the form of the underlying random variable, the optimisation of efficiency, and how to allocate sampling effort for best effect (e.g. one large sample or several smaller ones of the same total size). The various prospects are explored for two important positively skewed random variables (lognormal and extreme value), and explicit results are given for these cases. Whilst it turns out that the best approach is to use the largest possible single sample and the optimal ranked set best linear estimator (ranked set BLUE), we find some interesting, qualitatively different conclusions for the two skewed distributions.
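The sketch below gives a Monte Carlo feel for the efficiency gain in the lognormal case, comparing the basic balanced ranked set sample mean with a simple random sample mean that quantifies the same number of units. It assumes perfect (error-free, costless) ranking, uses an arbitrary set size and cycle count, and does not implement the ranked set BLUE discussed in the abstract.

```python
import numpy as np

rng = np.random.default_rng(6)

def rss_mean(k, cycles, draw):
    """Balanced ranked set sample mean with set size k (perfect ranking assumed)."""
    values = []
    for _ in range(cycles):
        for r in range(k):
            set_r = np.sort(draw(k))     # rank a set of k units (cheaply, without error)
            values.append(set_r[r])      # quantify only the r-th order statistic
    return np.mean(values)

def efficiency(k=4, cycles=5, n_rep=5000):
    draw = lambda n: rng.lognormal(mean=0.0, sigma=1.0, size=n)
    n = k * cycles                        # same number of quantified units for both designs
    srs = [draw(n).mean() for _ in range(n_rep)]
    rss = [rss_mean(k, cycles, draw) for _ in range(n_rep)]
    return np.var(srs) / np.var(rss)      # relative efficiency of RSS versus SRS

print(f"relative efficiency (lognormal, k=4): {efficiency():.2f}")
```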

16.
Material flow analysis (MFA) and value flow analysis (VFA) were applied to the sanitation system in an urban slum in Indonesia. Based on the results of the MFA and VFA, garbage and excreta disposal costs were evaluated to be 0.7% and 1.1%, respectively, of per capita income. Such value flows seem reasonable in light of the recognized ability-to-pay (ATP) standard. However, current excreta disposal methods create negative impacts on downstream populations. Because these disadvantages do not fall on the disposers themselves but are passed downstream, the current value flow structure does not motivate individual toilet users to install treatment facilities. Based on the current material and value flow structures, a resource-recycling sanitation system scenario was examined. Based on the VFA, an affordable initial cost for such a system was calculated; this was found to be comparable in price to a cheaper composting toilet currently available on the market.

17.
A new mathematical dose-response model is proposed for the expected probability of toxic response, and also for the expected measure of the overdispersion parameter, in reproductive and developmental risk assessment. The model for the expected probability of toxic response is a modified Weibull dose-response model incorporating the litter-size effect, while the model for the overdispersion parameter is a polynomial function of the dose level. A beta-binomial distribution for the number of offspring showing toxic responses in a litter satisfactorily accounts for the extra-binomial variation and the intralitter correlation of the responses of these pups. Confidence limits for low-dose extrapolation are based on the asymptotic distribution of the likelihood ratio. The safe dose for human exposure is then calculated by simple linear extrapolation. The model for overdispersion allows us to obtain estimates of the overdispersion parameter at these dosages, which was not possible with earlier models. The proposed model is illustrated by an application to a study on the effect of exposure to diethylhexylphthalate in mice. The results are compared with those obtained by Chen and Kodell (1989), who applied the simple Weibull dose-response model to the same data set. This paper was prepared with partial support from the United States Environmental Protection Agency under Cooperative Agreement Number CR-815273. The contents have not been subject to Agency review and therefore do not necessarily reflect the views or policies of the Agency, and no official endorsement should be inferred.
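As a hedged illustration of the modelling ingredients (not the paper's model, which also carries the litter-size effect and a dose-dependent overdispersion polynomial), the sketch below fits a beta-binomial likelihood with a Weibull-type dose-response P(d) = 1 − exp(−(a + b·d^g)) and a constant overdispersion parameter to synthetic litter data by maximum likelihood. Every parameter value and data setting is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, gammaln

rng = np.random.default_rng(7)

def weibull_response(dose, a, b, g):
    """P(toxic response) as a Weibull-type function of dose."""
    return 1.0 - np.exp(-(a + b * dose ** g))

def betabinom_logpmf(x, n, mu, rho):
    """Beta-binomial log-pmf parameterised by mean mu and overdispersion rho."""
    alpha = mu * (1.0 - rho) / rho
    beta = (1.0 - mu) * (1.0 - rho) / rho
    return (gammaln(n + 1) - gammaln(x + 1) - gammaln(n - x + 1)
            + betaln(x + alpha, n - x + beta) - betaln(alpha, beta))

def negloglik(params, dose, x, n):
    a, b, g, rho = params
    mu = np.clip(weibull_response(dose, a, b, g), 1e-8, 1 - 1e-8)
    return -np.sum(betabinom_logpmf(x, n, mu, rho))

# Synthetic litters: dose level, litter size n, responders x.
dose = np.repeat([0.0, 0.25, 0.5, 1.0], 20)
n = rng.integers(6, 14, size=dose.size)
true_mu = weibull_response(dose, 0.02, 1.5, 1.2)
x = rng.binomial(n, rng.beta(true_mu * 9 + 1e-6, (1 - true_mu) * 9 + 1e-6))

fit = minimize(negloglik, x0=[0.05, 1.0, 1.0, 0.1], args=(dose, x, n),
               bounds=[(1e-6, 5), (1e-6, 10), (0.1, 5), (1e-4, 0.9)],
               method="L-BFGS-B")
print("MLE (a, b, g, rho):", np.round(fit.x, 3))
```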

18.
In phased sampling, data obtained in one phase are used to design the sampling network for the next phase. Given N total observations, 1, ..., N phases are possible. Experiments were conducted with one-phase, two-phase, and N-phase design algorithms on surrogate models of sites with contaminated soils. The sampling objective was to identify, through interpolation, subunits of the site that required remediation. The cost-effectiveness of the alternative methods was compared using a loss function. More phases are better, but in economic terms the improvement is marginal. The optimal total number of samples is essentially independent of the number of phases. For two-phase designs, placing 75% of the samples in the first phase is near optimal; 20% or less is actually counterproductive. The U.S. Environmental Protection Agency (EPA), through its Office of Research and Development (ORD), partially funded and collaborated in the research described here. It has been subjected to the Agency's peer review and has been approved as an EPA publication. The U.S. Government has a non-exclusive, royalty-free licence in and to any copyright covering this article.

19.
The United States Environmental Protection Agency's Environmental Monitoring and Assessment Program (EMAP) is designed to describe the status, trends and spatial pattern of indicators of the condition of the nation's ecological resources. The proposed sampling design for EMAP is based on a triangular systematic grid and employs both variable probability and double sampling. The Horvitz-Thompson estimator provides the foundation of the design-based estimation strategy used in EMAP. However, special features of EMAP designed to accommodate the complexity of sampling environmental resources on a national scale require modifications of standard variance estimation procedures as well as the development of new techniques. An overview of variance estimation methods proposed for application to EMAP's sampling strategy for discrete resources is presented.
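For readers unfamiliar with the estimator, the following is a minimal, self-contained sketch of the Horvitz-Thompson estimator of a population total under unequal-probability (here Poisson) sampling, with a Monte Carlo check of its design-unbiasedness. It does not touch the EMAP-specific variance estimators the abstract is about, and the population and inclusion probabilities are made up.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical population of 500 units with a size measure driving inclusion probabilities.
N = 500
y = rng.gamma(shape=2.0, scale=10.0, size=N)                  # variable of interest
size_measure = (y + rng.normal(0.0, 5.0, size=N)).clip(min=0.1)
pi = (60 * size_measure / size_measure.sum()).clip(max=1.0)   # expected sample size about 60

def horvitz_thompson_total(y, pi, sampled):
    """Horvitz-Thompson estimator of the population total: sum of y_i / pi_i over the sample."""
    return np.sum(y[sampled] / pi[sampled])

estimates = [horvitz_thompson_total(y, pi, rng.random(N) < pi) for _ in range(5000)]
print(f"true total {y.sum():.1f}   mean HT estimate {np.mean(estimates):.1f}   "
      f"Monte Carlo SE {np.std(estimates):.1f}")
```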

20.
Adaptive cluster sampling (ACS) is a technique for sampling rare and geographically clustered populations. Aiming to enhance the practicability of ACS while maintaining some of its major characteristics, an adaptive sample plot design is introduced in this study that facilitates field work compared with “standard” ACS. The plot design is based on a conditional plot expansion: a larger plot (larger by a pre-defined plot size factor) is installed at a sample point instead of the smaller initial plot if a pre-defined condition is fulfilled. This study provides insight into the statistical performance of the proposed adaptive plot design. A design-unbiased estimator is presented and used on six artificial and one real tree position map to estimate density (number of objects per ha). The performance in terms of the coefficient of variation is compared with that of the non-adaptive alternative without a conditional expansion of plot size. The adaptive plot design was superior in all cases, but the improvement depends on (1) the structure of the sampled population, (2) the plot size factor and (3) the critical value (the minimum number of objects triggering an expansion). For some spatial arrangements the improvement is relatively small. The adaptive design may be particularly attractive for sampling rare and compactly clustered populations with an appropriately chosen plot size factor.

