Similar Documents
20 similar records found.
1.
This paper reviews four commonly used statistical methods for environmental data analysis and, through real case-study data, discusses potential pitfalls in their application. The four methods are percentiles and confidence intervals, correlation coefficients, regression analysis, and analysis of variance (ANOVA). For percentile and confidence-interval estimation, a key pitfall is the automatic assumption of a normal distribution for environmental data, which often follow a log-normal distribution instead. For the correlation coefficient, a pitfall is the use of a wide range of data points, in which the largest values may dominate the smaller ones and skew the coefficient. For regression analysis, a pitfall is the propagation of uncertainty in the input variables into the model prediction, which may be even more uncertain. For ANOVA, a pitfall is accepting a hypothesis on weak evidence and treating it as a strong conclusion. As demonstrated in this paper, very different conclusions may be drawn from the same statistical analysis if these pitfalls are not identified. Reminders and lessons drawn from the pitfalls are given at the end of the article.
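The percentile pitfall above can be illustrated with a short Python sketch (entirely synthetic data, not from the paper): for right-skewed, log-normal concentrations, blindly assuming normality can even yield a negative lower percentile, which is physically impossible for a concentration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical synthetic concentrations: log-normal, as environmental data often are
data = rng.lognormal(mean=1.0, sigma=1.5, size=5000)

# Pitfall: assume normality -> 5th percentile estimated as mean - 1.645*sd
p5_normal = data.mean() - 1.645 * data.std()

# Log-normal-aware estimate: work on the log scale, then transform back
logs = np.log(data)
p5_lognormal = np.exp(logs.mean() - 1.645 * logs.std())

print(p5_normal)     # negative -- impossible for a concentration
print(p5_lognormal)  # positive, close to the empirical 5th percentile
```

The log-scale estimate tracks the empirical 5th percentile of the data, while the normal-theory estimate falls below zero because the heavy right tail inflates the standard deviation.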

2.
Efraim Halfon 《Chemosphere》1985,14(9):1433-1440
A recently introduced statistical method, the bootstrap, is used to verify whether a hypothesis developed from a limited data set would remain valid if all possible data had been available; that is, the method allows generalization to chemicals of the same class not included in the original analysis. The validity of the relation hypothesized by Neely between the water solubility of an organic chemical and the ratio of its acute fish LC50 values at two different time periods has been tested. The hypothesis was tested by first fitting a linear model with a geometric mean (GM) functional regression, which accounts for errors in both the independent and dependent variables, to compare observed and predicted ratios; the generality of the model was then assessed by computing bootstrap confidence limits for the correlation coefficient and for the slope and intercept of the GM regression model. The results show that the correlation between predicted and observed data is statistically significant within one standard deviation, but it may not always be significant at the 95% confidence limit. Neely's model is probably correct, but it may have a systematic bias that makes the theoretical ratio somewhat higher than the observed ratio.
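The bootstrap idea, resampling the data with replacement and reading confidence limits for a statistic off the resampling distribution, can be sketched as follows (hypothetical paired data standing in for predicted and observed ratios; not the original LC50 set):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical paired observations (e.g., predicted vs. observed ratios)
x = rng.normal(size=40)
y = 0.8 * x + rng.normal(scale=0.5, size=40)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Bootstrap: resample (x, y) pairs with replacement, recompute r each time
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(x), size=len(x))
    boot.append(corr(x[idx], y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"r = {corr(x, y):.2f}, 95% bootstrap CI = ({lo:.2f}, {hi:.2f})")
```

If the lower confidence limit stays above zero, the correlation is significant at the 95% level; the same resampling loop gives limits for the GM regression slope and intercept.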

3.
Environmental forensic analysis has evolved significantly from the early days of qualitative chemical fingerprint evaluations. The need for quantitative rigor has made the use of numerical methods critical in identifying and mapping contaminant sources in complex environmental systems. Given multiple contaminant sources, the environmental scientist is faced with the challenge of unraveling the contributions of multiple plumes with overlapping spatial and temporal distributions. The problem may be addressed through a multivariate statistical approach, but there is a bewildering array of available "chemometric" methods. This paper provides an overview of these methods, along with a review of their advantages, disadvantages, and pitfalls. Methods discussed include principal component analysis and several receptor-modeling techniques.

5.
The purpose of this project was to investigate the relationship between ambient air quality measurements from two analytical methods, referred to as the total oxidant method and the chemiluminescent method. These two well-documented analytical methods were run simultaneously, side by side, at a site on the Houston ship channel and were calibrated daily. The hourly averages were analyzed by regression techniques, and confidence intervals were calculated for the regression lines. Confidence intervals for point estimates were also calculated. These methods were applied to all data sets with values greater than 10 parts per billion and again to those with values greater than 30 parts per billion. A regression line was also calculated for a second set of data from the preceding year. These data were generated before a chromium trioxide scrubber was installed to eliminate possible chemical interferences with the KI (potassium iodide) method.

The results show that, in general, the chemiluminescent ozone method tends to produce values as much as two times higher than the simultaneous total oxidant values. In one set of data, an 80-ppb chemiluminescent ozone value predicted a total oxidant value of 43.9 ppb with a 95% confidence interval of 7.7 to 80.4 ppb. In the second set of data, an 80-ppb chemiluminescent ozone value predicted a total oxidant value of 78 ppb with a 95% confidence interval of 0.4 to 156 ppb. Other statistical analyses confirmed that either measurement was a very poor predictor of the other.
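The wide intervals reported above arise naturally when predicting a single new observation from a noisy regression. A sketch of the calculation (synthetic data in arbitrary units; not the Houston measurements) shows how a prediction interval for a point estimate is computed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical paired readings from two co-located methods (arbitrary units)
x = rng.uniform(20, 120, size=60)            # "method A" readings
y = 0.6 * x + rng.normal(scale=15, size=60)  # noisy "method B" readings

res = stats.linregress(x, y)
n = len(x)
x0 = 80.0
y0 = res.intercept + res.slope * x0

# 95% prediction interval for a single new observation at x0
resid = y - (res.intercept + res.slope * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))
se = s * np.sqrt(1 + 1/n + (x0 - x.mean())**2 / np.sum((x - x.mean())**2))
t = stats.t.ppf(0.975, n - 2)
print(f"predicted {y0:.1f}, 95% PI ({y0 - t*se:.1f}, {y0 + t*se:.1f})")
```

The leading 1 inside the square root is what makes prediction intervals for individual observations much wider than the confidence band for the regression line itself, which is why one method can be a poor predictor of the other even when the fitted line looks reasonable.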

6.
The physical-chemical equations relating solubility to the octanol-water partition coefficient are presented and used to develop a new correlation between these quantities that includes a melting-point (fugacity-ratio) correction. The correlation is satisfactory for 45 organic compounds but is not applicable to organic acids. For very high molecular weight (>290) compounds the correlation is less satisfactory, either, it is believed, because the data are inaccurate or because the tendency of these compounds to partition into organic phases is less than expected. This may have profound environmental implications.

7.
Arsenic (As) has been proven to be highly toxic to humans, but limited attention has focused on exposure levels and potential risks to mother-neonate pairs in coastal populations. This study examined As concentrations in colostrum and umbilical cord serum collected from 106 mother-neonate pairs living on Shengsi Island, facing the Yangtze River estuary and Hangzhou Bay in China. Average concentrations of total As in colostrum and cord serum were 18.51 ± 7.00 and 19.83 ± 10.50 μg/L, respectively. One-way ANOVA showed that delivery age and source of drinking water played significant roles in maternal exposure patterns. Correlation analysis indicated a significantly positive association between As concentrations in colostrum and cord serum. Multivariable linear regression models adjusted for other confounders clarified the dose-response relationship, with a coefficient of 0.23 and a 95% confidence interval of (0.006, 0.492); p < 0.05. The calculated daily intake of total As by neonates through breastfeeding ranged from 0.413 to 3.65 μg/kg body weight, and colostrum As, especially its most toxic species, inorganic arsenic (iAs), may pose a risk to neonates.

8.
There is a frequent need in the environmental sciences to show the similarity of the results given by two analytical methods. This cannot, however, be done within the conventional 'there is a difference' statistical hypothesis setting of, among others, Student's t-test. We demonstrate here a more appropriate approach that originates from drug testing and that can be applied with standard statistical software. It is a demanding approach, as it requires quantification of the similarity limit. If no predetermined value is given for similarity, a potential data-supported similarity limit can be explored from the data. The approach has numerous other potential application areas, e.g. parallelism of regression slopes, homogeneity of variances, and lack of interaction.
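The equivalence-testing approach described above is commonly implemented as two one-sided tests (TOST). A minimal sketch, assuming synthetic paired measurements and a hypothetical similarity limit of ±2 units:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical paired results from two analytical methods on the same samples
a = rng.normal(50, 5, size=30)
b = a + rng.normal(0.2, 1.0, size=30)  # nearly identical method

d = b - a
margin = 2.0  # pre-specified similarity limit (an assumption for this sketch)
n = len(d)
se = d.std(ddof=1) / np.sqrt(n)

# Two one-sided tests: reject both "mean diff <= -margin" and "mean diff >= +margin"
t_lower = (d.mean() + margin) / se
t_upper = (d.mean() - margin) / se
p_lower = 1 - stats.t.cdf(t_lower, n - 1)
p_upper = stats.t.cdf(t_upper, n - 1)
p_tost = max(p_lower, p_upper)
print(f"mean difference {d.mean():.2f}, TOST p = {p_tost:.4f}")
```

A small TOST p-value supports similarity within the stated margin; note that, unlike the t-test, the conclusion depends entirely on the pre-specified limit, which is exactly the quantification challenge the abstract highlights.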

9.
Statistical analysis of regulatory ecotoxicity tests.   Total citations: 10 (self-citations: 0, by others: 10)
ANOVA-type data analysis, i.e., determination of lowest-observed-effect concentrations (LOECs) and no-observed-effect concentrations (NOECs), has been widely used for the statistical analysis of chronic ecotoxicity data. However, it has been increasingly criticised, most importantly because the NOEC depends on the choice of test concentrations and the number of replicates, and rewards poor experiments, i.e., those with high variability, with high NOEC values. Thus, a recent OECD workshop concluded that use of the NOEC should be phased out and that a regression-based estimation procedure should be used instead. Following this workshop, a working group was established at the French level among government, academia, and industry representatives. Twenty-seven sets of chronic data (algae, daphnia, fish) were collected and analysed by ANOVA and regression procedures. Several regression models were compared, and relations between NOECs and ECx, for different values of x, were established in order to find an alternative summary parameter to the NOEC. Biological arguments are scarce for defining a negligible level of effect x for the ECx. With regard to use in risk assessment procedures, a convenient methodology would be to choose x so that ECx values are on average similar to present NOECs; this would require no major change in the risk assessment procedure. However, experimental data show that ECx values depend on the regression model and that their accuracy decreases in the low-effect zone. This disadvantage could probably be reduced by adapting existing experimental protocols, though this could mean more experimental effort and higher cost. An ECx (derived with existing test guidelines, e.g., regarding the number of replicates) whose lower confidence bound is on average similar to the present NOEC would improve this approach by a priori encouraging more precise experiments.
However, narrow confidence intervals are linked not only to good experimental practice but also to the distance between the best model fit and the experimental data. In any case, these approaches still use the NOEC as a reference, although this reference is statistically unsound. By contrast, EC50 values are the most precise estimates on a concentration-response curve, but they are clearly different from the NOEC, and their use would require modifying existing assessment factors.
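The regression-based ECx idea can be sketched by fitting a concentration-response model and inverting it (hypothetical test data; the logistic form used here is one common choice, not necessarily the model adopted by the working group):

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical chronic-test data: response declining with concentration
conc = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
resp = np.array([100., 99., 97., 90., 70., 40., 15.])  # e.g. % of control growth

def logistic(c, ec50, slope):
    # Two-parameter log-logistic curve fixed at 100% response for the control
    return 100.0 / (1.0 + (c / ec50) ** slope)

(ec50, slope), _ = curve_fit(logistic, conc, resp, p0=[5.0, 2.0])

# ECx by inverting the fitted curve: concentration giving x% effect
def ecx(x):
    return ec50 * (x / (100.0 - x)) ** (1.0 / slope)

print(f"EC50 = {ec50:.2f}, EC10 = {ecx(10):.2f}")
```

Unlike a NOEC, the estimate does not depend on which test concentrations happened to be chosen, though, as the abstract notes, low-effect estimates such as EC10 inherit more uncertainty from the model fit than the EC50 does.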

10.
This study comprehensively characterizes hourly fine particulate matter (PM2.5) concentrations measured via a tapered element oscillating microbalance (TEOM), beta-gauge, and nephelometer at four different monitoring sites in U.S. Environmental Protection Agency (EPA) Region 5 (in Illinois, Michigan, and Wisconsin) and compares them to the Federal Reference Method (FRM). Hourly characterization uses time series and autocorrelation. Hourly data are compared with the FRM by averaging across 24-hr sampling periods and modeling against the respective daily FRM concentrations. Modeling uses traditional two-variable linear least-squares regression as well as nonlinear regression involving additional meteorological variables such as temperature and humidity. The TEOM shows a relationship with season and temperature, with a linear correlation as low as 0.7924 and a nonlinear model correlation as high as 0.9370 when modeled with temperature. The beta-gauge shows no relationship with season or meteorological variables; it exhibits a linear correlation as low as 0.8505 with the FRM and a nonlinear model correlation as high as 0.9339 when modeled with humidity. The nephelometer shows no relationship with season or temperature, but a strong relationship with humidity is observed: a linear correlation as low as 0.3050 and a nonlinear model correlation as high as 0.9508 when modeled with humidity. Nonlinear models have higher correlations than linear models applied to the same dataset, though the difference is not always substantial, which may introduce a tradeoff between model simplicity and degree of statistical association. This project shows that continuous monitor technology produces valid PM2.5 characterization, at least partially accounting for deviations from gravimetric reference monitors once appropriate nonlinear adjustments are applied.
Although only one regression technically meets the new EPA National Ambient Air Quality Standards (NAAQS) Federal Equivalent Method (FEM) correlation coefficient criteria, several others are extremely close, showing promising potential for use of this nonlinear adjustment model in garnering EPA NAAQS FEM approval for continuous PM2.5 sampling methods.
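The benefit of adding a meteorological term can be sketched with synthetic data (a hypothetical humidity artifact on an invented continuous monitor; not the Region 5 measurements):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(5)
# Hypothetical co-located data: continuous-monitor PM vs. daily reference PM,
# with a humidity artifact inflating the continuous reading
frm = rng.uniform(5, 40, size=120)
rh = rng.uniform(20, 95, size=120)
monitor = frm * (1 + 0.004 * (rh - 40)) + rng.normal(scale=1.0, size=120)

# Linear adjustment ignores humidity
lin = np.polyfit(monitor, frm, 1)
r_lin = np.corrcoef(np.polyval(lin, monitor), frm)[0, 1]

# Nonlinear adjustment includes a multiplicative humidity term
def model(X, a, b, c):
    m, h = X
    return a + b * m / (1 + c * (h - 40))

popt, _ = curve_fit(model, (monitor, rh), frm, p0=[0.0, 1.0, 0.0])
r_nl = np.corrcoef(model((monitor, rh), *popt), frm)[0, 1]
print(f"linear r = {r_lin:.4f}, nonlinear r = {r_nl:.4f}")
```

Because the humidity model nests the linear one (c = 0), its fitted correlation cannot be worse; how much it helps depends on the strength of the meteorological artifact, mirroring the tradeoff noted in the abstract.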

12.
The source of crude oils and petroleum products released into navigable waterways and shipping lanes is not always known. Thus, the defensible identification of spilled crude oils and petroleum products and their correlation to suspected sources is a critical part of many oil spill assessments. Quantitative "fingerprinting" analysis, when evaluated using straightforward statistical and numerical analyses, provides a defensible means to differentiate among qualitatively similar oils and provides the best assessment of the source(s) of spilled oils. Polycyclic aromatic hydrocarbon (PAH) and petroleum biomarker concentration data are a particularly useful quantitative measure that can benefit most oil spill investigations. In this paper the strategy and methodology for correlation analysis based on quantitative gas chromatography/mass spectrometry operated in selected ion monitoring mode (GC/MS-SIM) is demonstrated in a case study involving 66 candidate sources for a heavy fuel oil spill of unknown origin. The strategy includes identification of 19 chemical indices (out of 45 evaluated) based upon PAHs and biomarkers that were (1) independent of weathering and (2) precisely measured, both criteria being established by statistical analysis of the data. The 19 chemical indices meeting these criteria are subsequently analysed using principal component analysis (PCA), which helps to determine defensibly the "prime suspects" for the oil spill under investigation. The strategy and methodology described, which combines statistical and numerical analysis of quantitative chemical data, can be adapted and applied to other environmental forensic investigations with the objective of correlating any form of contamination to its suspected sources.

13.
Water in the urban front-range corridor of Colorado has become an increasingly critical resource as the state faces both supply issues and anthropogenic degradation of water quality in several aquifers used for drinking water. A proposed development (up to 1100 homes over two quarter-quarter sections) at Todd Creek, Colorado, a suburb of Westminster located about 20 miles northeast of Denver, is considering use of onsite wastewater systems (OWS) to treat and remove domestic wastewater. Local health and environmental agencies have concerns about potential impacts on local water quality. Nitrogen treatment in the vadose zone and subsequent transport to ground water at a development scale is the focus of this investigation. The numerical model HYDRUS 1D was used, with input based on site-specific data and several transport parameters estimated from statistical distributions, to simulate nitrate concentrations reaching ground water. The model predictions were highly sensitive to the mass loading of nitrogen from OWS and the denitrification rate coefficient. The mass loading is relatively certain for the large number of proposed OWS. However, reasonable values for the denitrification rate coefficient vary over three orders of magnitude. Using the median value from a cumulative frequency distribution function, based on rates obtained from the literature, resulted in simulated nitrate concentrations that were less than 1% of regulatory maximum concentrations. Reasonable rates at the lower end of the reported range, corresponding to lower 95% confidence interval estimates, resulted in simulated nitrate concentrations reaching ground water above regulatory limits.

15.
16.
This paper presents an evaluation of four Gaussian models (GM, HIWAY, AIRPOL-4, CALINE-2) and three numerical models (DANARD, MROAD 2, ROADS) against the tracer gas data collected in the General Motors experiment. Various statistical techniques are employed to quantify the predictive capability of each model. In general, the three numerical models performed rather poorly compared to the Gaussian models. For this data set, the model with the best performance in predicting the measured concentrations was the GM model, followed in order by AIRPOL-4, HIWAY, CALINE-2, DANARD, MROAD 2, and ROADS. Although the GM model provides by far the best simulation of the models tested here, it is skewed toward underprediction. As a screening tool for regulatory purposes, however, the HIWAY model would be useful, since it has the highest percentage of overprediction when concentration data in the 50th through 100th percentile range are included in the analysis. The present version of the HIWAY model warrants modification to improve its predictive capability under stable and parallel wind-road conditions. Current studies indicate that the modified HIWAY model can be used with greater confidence by regulatory agencies.

17.
Fuzzy QSARs for predicting logKoc of persistent organic pollutants   Total citations: 2 (self-citations: 0, by others: 2)
Uddameri V  Kuchanur M 《Chemosphere》2004,54(6):771-776

18.
Juan CY  Green M  Thomas GO 《Chemosphere》2002,46(7):1091-1097
The statistical treatment of data sets from environmental pollutant studies, in which different measurements are combined to produce averages or comparative factors (e.g., transfer coefficients (TCs) or input-output balance values), is considered here, with particular reference to the analysis of data from input-output balance studies of pollutants such as PCBs in animals and humans. Many methods of statistical analysis ignore the fact that all measurements are subject to error, and they generally assume that the normal distribution applies to all data sets, which is commonly inappropriate for environmental (and particularly biological) data. Considerably different estimates can be obtained by applying different, commonly used statistical methods, as shown in a simulation study presented here and in its application to data from an input-output balance study of PCBs in humans. Alternative average and combined-factor estimators for these types of studies are proposed that give considerable advantages in terms of bias and ease of assessing accuracy.

19.
Groundwater hydrochemistry of an urban industrial region in the Indo-Gangetic plains of north India was investigated. Groundwater samples were collected from both the industrial and non-industrial areas of Kanpur. The hydrochemical data were analyzed using various water quality indices and nonparametric statistical methods. Principal components analysis (PCA) was performed to identify the factors responsible for groundwater contamination. Ensemble learning-based decision treeboost (DTB) models were constructed to develop discriminating and regression functions to differentiate the groundwater hydrochemistry of the three different areas, to identify the responsible factors, and to predict groundwater quality from selected measured variables. The results indicated non-normal distributions and wide variability of water quality variables in all the study areas, suggesting a nonhomogeneous distribution of sources in the region. PCA results showed contaminants of industrial origin dominating in the region. The DTB classification model identified pH, redox potential, total Cr, and λ254 as the discriminating variables for the water quality of the three areas, with an average accuracy of 99.51% on the complete data. The regression model predicted the groundwater chemical oxygen demand values with high correlation to measured values (0.962 in training; 0.918 in test) and respective low root mean-squared errors of 2.24 and 2.01 in the training and test arrays. The statistical and chemometric approaches used here suggest that groundwater hydrochemistry differs among the three areas and is dominated by different variables. The proposed methods can be used as effective tools in groundwater management.

20.
Spearman's rank correlation coefficient is a useful tool for exploratory data analysis in environmental forensic investigations. In this application it is used to detect monotonic trends in chemical concentration over time or space.
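A minimal sketch (hypothetical monitoring data): because it is computed on ranks, Spearman's coefficient captures a monotonic but strongly nonlinear trend that Pearson's linear correlation understates.

```python
import numpy as np
from scipy import stats

# Hypothetical monotonic but nonlinear trend: concentration rising with distance
distance = np.arange(1, 11)
conc = np.array([2.1, 2.3, 2.2, 3.0, 4.8, 5.1, 7.9, 12.5, 20.3, 33.0])

rho, p = stats.spearmanr(distance, conc)
r, _ = stats.pearsonr(distance, conc)
print(f"Spearman rho = {rho:.3f} (p = {p:.4f}); Pearson r = {r:.3f}")
```

Spearman's rho is near 1 here despite the curvature, which is exactly why it suits exploratory trend detection where the functional form of the trend is unknown.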


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号