Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
《Chemosphere》2009,74(11):1701-1707
The aim was to develop a reliable and practical quantitative structure-activity relationship (QSAR) model validated by strict conditions for predicting bioconcentration factors (BCF). We built up several QSAR models starting from a large data set of 473 heterogeneous chemicals, based on multiple linear regression (MLR), radial basis function neural network (RBFNN) and support vector machine (SVM) methods. To improve the results, we also applied a hybrid model, which gave better prediction than single models. All models were statistically analysed using strict criteria, including an external test set. The outliers were also examined to understand better in which cases large errors were to be expected and to improve the predictive models. The models offer more robust tools for regulatory purposes, on the basis of the statistical results and the quality check on the input data.

2.
The aim was to develop a reliable and practical quantitative structure-activity relationship (QSAR) model validated by strict conditions for predicting bioconcentration factors (BCF). We built up several QSAR models starting from a large data set of 473 heterogeneous chemicals, based on multiple linear regression (MLR), radial basis function neural network (RBFNN) and support vector machine (SVM) methods. To improve the results, we also applied a hybrid model, which gave better prediction than single models. All models were statistically analysed using strict criteria, including an external test set. The outliers were also examined to understand better in which cases large errors were to be expected and to improve the predictive models. The models offer more robust tools for regulatory purposes, on the basis of the statistical results and the quality check on the input data.

3.
Quantitative structure-activity relationships (QSARs) urgently need to be applied in regulatory programs. Many QSAR models can predict the effect of a wide range of substances on different endpoints, particularly in the case of ecotoxicity, but it is difficult to choose the most appropriate model on the basis of the requirements of the application. During the EC-funded project DEMETRA (www.demetra-tox.net) a large number of QSAR models were developed for the prediction of different ecotoxicological endpoints. DEMETRA individual models on rainbow trout LC50 after 96 h, water flea LC50 after 48 h and honey bee LD50 after 48 h have been used as a QSAR database to test the advantages of a new index for evaluating model uncertainty. This index takes into consideration the number of outliers (weighted on the total number of compounds) and their root mean square error. Application to the DEMETRA QSAR database indicated that the index can identify the models with the best performance with regard to outliers, and can be used, together with other classical statistical measures (e.g., the squared correlation coefficient), to support the evaluation of QSAR models.

4.
5.
The widely used ECOSAR computer programme for QSAR prediction of chemical toxicity towards aquatic organisms was evaluated by using large data sets of industrial chemicals with varying molecular structures. Experimentally derived toxicity data covering acute effects on fish, Daphnia and green algae growth inhibition for more than 1,000 randomly selected substances in total were compared to the prediction results of the ECOSAR programme in order (1) to assess the capability of ECOSAR to correctly classify the chemicals into defined classes of aquatic toxicity according to rules of EU regulation and (2) to determine the number of correct predictions within tolerance factors from 2 to 1,000. Regarding ecotoxicity classification, 65% (fish), 52% (Daphnia) and 49% (algae) of the substances were correctly predicted into the classes "not harmful", "harmful", "toxic" and "very toxic". At all trophic levels the toxicity of about 20% of the chemicals was underestimated. The class of "not harmful" substances (experimental LC50/EC50 > 100 mg/l) represents nearly half of the whole data set. The percentages for correct predictions of toxic effects on fish, Daphnia and algae growth inhibition were 69%, 64% and 60%, respectively, when a tolerance factor of 10 was allowed. Focussing on those experimental results which were verified by analytically measured concentrations, the predictability for Daphnia and algae toxicity was improved by approximately three percentage points, whereas for fish no improvement was determined. The calculated correlation coefficients demonstrated poor correlation when the complete data set was taken, but showed good results for some of the ECOSAR chemical classes. The results are discussed in the context of literature data on the performance of ECOSAR and other QSAR models.
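The "tolerance factor" used in the ECOSAR evaluation above can be read as a fold-difference check between predicted and experimental values. The following is a minimal sketch under that assumption; the function names and the example values are hypothetical and are not taken from the study.

```python
def within_tolerance(predicted, experimental, factor):
    """Return True when two positive values differ by at most `factor`-fold."""
    ratio = predicted / experimental
    return 1.0 / factor <= ratio <= factor


def fraction_within(pairs, factor):
    """Fraction of (predicted, experimental) pairs inside the tolerance factor."""
    hits = sum(within_tolerance(p, e, factor) for p, e in pairs)
    return hits / len(pairs)


# Hypothetical (predicted, experimental) LC50 values in mg/l, for illustration only.
pairs = [(1.0, 5.0), (120.0, 10.0), (0.5, 0.4), (30.0, 400.0)]

for factor in (2, 10, 1000):
    print(factor, fraction_within(pairs, factor))
```

Widening the factor can only keep or increase the fraction counted as correct, which is why results of this kind are usually reported for a range of factors.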

6.
A novel approach to predict aquatic toxicity from molecular structure   Total citations: 1, self-citations: 0, citations by others: 1
The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for the prediction of aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset consists of 392 benzene derivatives, separated into training and test sets, for which toxicity data to the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained with non-stochastic (R² = 0.791 and s = 0.344) and stochastic (R² = 0.799 and s = 0.343) linear indices. A leave-one-out (LOO) cross-validation procedure was carried out, achieving values of q² = 0.781 (scv = 0.348) and q² = 0.786 (scv = 0.350), respectively. In addition, a validation through an external test set was performed, which yielded significant R²pred values of 0.762 and 0.797. A brief study of the influence of statistical outliers on QSAR model development was also carried out. Finally, our method was compared with other approaches implemented in the Dragon software, achieving better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
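The leave-one-out q² statistic mentioned in the abstract above can be illustrated with a small sketch. It assumes a single descriptor and ordinary least squares rather than the multiple linear regression used in the study, and it computes q² as 1 - PRESS/SS_tot with the overall mean of the observations (other conventions exist). All names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    n = len(xs)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and toxicities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
toxicity = [1.1, 1.9, 3.2, 3.9, 5.1]
print(loo_q2(descriptor, toxicity))
```

Because every prediction comes from a model that never saw the left-out compound, q² is normally lower than the fitted R², which is consistent with the values reported in the abstract.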

7.
Ranking of aquatic toxicity of esters modelled by QSAR   Total citations: 1, self-citations: 0, citations by others: 1

8.
9.
10.
Ashek A  Lee C  Park H  Cho SJ 《Chemosphere》2006,65(3):521-529
In the present study we have performed comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) on structurally diverse ligands of the Ah (dioxin) receptor to explore the physico-chemical requirements for binding. All CoMFA and CoMSIA models gave q² values above 0.5 and r² values above 0.84. The predictive ability of the models was validated by an external test set, which gave satisfactory predictive r² values. The best predictions were obtained with the CoMFA model of the combined modified training set (q² = 0.631, r² = 0.900), giving a predictive residual value of 0.02 log unit for the test compound. The addition of the CoMSIA study elucidated the role of hydrophobicity and hydrogen bonding along with the effect of steric and electrostatic properties revealed by CoMFA. We suggest a model comprising four structurally different compounds, which offers good predictability for various ligands. Our QSAR model is consistent with all previously established QSAR models with less structurally diverse ligands.

11.
The CORAL software (http://www.insilico.eu/CORAL), which is convenient to apply and available on the Internet, has been used to build quantitative structure-activity relationships (QSAR) for predicting the cytotoxicity of metal oxide nanoparticles to the bacterium Escherichia coli (pEC50, the negative logarithm of the concentration for 50% effect). In this study six random splits of the data into training and test sets were examined. It has been shown that CORAL provides a reliable tool that could be used to build a QSAR for pEC50.

12.
Ren S 《Chemosphere》2003,53(9):1053-1065
In ecotoxicology, mechanism-based quantitative structure-activity relationships (QSARs) are usually developed with higher quality than QSARs without regard to toxicity mechanism. Correctly determining the mechanism of a compound, which is not always easy, is required to use mechanism-based QSARs for toxicity prediction. The mechanism determination step may introduce extra errors in addition to the intrinsic prediction errors of mechanism-based QSARs, thus compromising these QSARs' performance compared with QSARs regardless of mechanism. In this study, the mechanism identification-toxicity prediction (MI-TP) approach was compared with the direct toxicity prediction (DTP) approach using a data set containing phenol toxicity to Tetrahymena pyriformis. A statistical mechanism classification model for mechanism prediction, four mechanism-based QSARs and a single QSAR without discriminating between mechanisms were developed for toxicity prediction. Toxicity of phenols in an external data set was predicted following the MI-TP and DTP approaches. Results indicated that the mechanisms of several phenols in the external test set were incorrectly predicted which led to significant over- or under-estimation of their toxicity. Overall, the MI-TP approach did not yield more accurate toxicity prediction than the DTP approach.

13.
14.
Stenberg M  Andersson PL 《Chemosphere》2008,71(10):1909-1915
The non-dioxin-like polychlorinated biphenyls (NDL-PCBs) constitute the major proportion of PCBs found in food and human tissues. It is important to improve our understanding of the toxicity, environmental and human risks associated with the NDL-PCBs, since their toxicology is incompletely characterized and a human health risk assessment is required. This paper discusses the selection of a training set of 20 tri- to hepta-chlorinated biphenyls, PCBs 19, 28, 47, 51, 52, 53, 74, 95, 100, 101, 104, 118, 122, 128, 136, 138, 153, 170, 180, and 190, suggested for comprehensive screening using in vitro assays to identify critical mechanisms of toxicological action. The selected PCBs form a balanced basis for the development of quantitative structure-activity relationship (QSAR) models for prediction of physicochemical and toxicological properties of non-tested PCB congeners. Chemical and physical properties, environmental abundance and toxicological activities of the congeners were considered during the selection process. A complementary set of PCBs, a reference set, was selected using D-optimal onion design including PCBs 18, 20, 28, 30, 37, 40, 50, 54, 60, 77, 82, 99, 122, 132, 153, 161, 170, 188, 192, and 193. Congeners of this set are well suited for validation of QSAR models developed using the training set. For visualization of the chemical diversity of environmentally abundant PCBs and congeners of the training and reference sets, principal component analysis (PCA) was used. Statistical molecular design was used to verify the structural representation. As a reference structure for dioxin-like PCBs, PCB 126 was added to the training set. The selected set of NDL-PCBs is proposed for use in toxicological testing programs to provide a rational basis for risk assessment of the NDL-PCBs.

15.
This paper presents one of the first applications of deep learning (DL) techniques to predict air pollution time series. Air quality management relies extensively on time series data captured at air monitoring stations as the basis of identifying population exposure to airborne pollutants and determining compliance with local ambient air standards. In this paper, 8-hour averaged surface ozone (O3) concentrations were predicted using deep learning consisting of a recurrent neural network (RNN) with long short-term memory (LSTM). Hourly air quality and meteorological data were used to train and forecast values up to 72 hours with low error rates. The LSTM was able to forecast the duration of continuous O3 exceedances as well. Prior to training the network, the dataset was reviewed for missing data and outliers. Missing data were imputed using a novel technique that averaged gaps less than eight time steps with incremental steps based on first-order differences of neighboring time periods. Data were then used to train decision trees to evaluate input feature importance over different time prediction horizons. The number of features used to train the LSTM model was reduced from 25 features to 5 features, resulting in improved accuracy as measured by Mean Absolute Error (MAE). Parameter sensitivity analysis showed that the look-back nodes associated with the RNN were a significant source of error if not aligned with the prediction horizon. Overall, MAE values below 2 were calculated for predictions out to 72 hours.

Implications: Novel deep learning techniques were used to train an 8-hour averaged ozone forecast model. Missing data and outliers within the captured data set were replaced using a new imputation method that generated calculated values closer to the expected value based on the time and season. Decision trees were used to identify input variables with the greatest importance. The methods presented in this paper allow air managers to forecast long-range air pollution concentration while only monitoring key parameters and without transforming the data set in its entirety, thus allowing real-time inputs and continuous prediction.
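The abstract above does not fully specify its imputation technique, so the sketch below is only a stand-in: it fills gaps shorter than eight time steps by equal increments between the neighboring observed values (plain linear interpolation) and leaves longer gaps untouched. The function name, the `max_gap` parameter and the sample series are all assumptions made for illustration.

```python
def fill_short_gaps(series, max_gap=7):
    """Fill runs of None no longer than `max_gap` by equal increments
    between the observed values on either side; leave other gaps as None."""
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first observed value after the gap
        gap = end - start
        # Skip gaps that touch either end of the series or are too long.
        if start == 0 or end == n or gap > max_gap:
            continue
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap + 1)
        for k in range(gap):
            filled[start + k] = left + step * (k + 1)
    return filled


# Hypothetical hourly ozone readings with a two-step gap, for illustration only.
print(fill_short_gaps([1.0, None, None, 4.0]))
```

Leaving long gaps unfilled keeps the sketch from inventing extended stretches of data; how the original study treated such gaps is not stated in the abstract.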


16.
17.
Chen D  Yin C  Wang X  Wang L 《Chemosphere》2004,57(11):1739-1745
The recently developed HQSAR (Holographic QSAR) method can rapidly and easily generate QSAR models of high statistical quality and predictive value. HQSAR analysis requires selecting values for parameters that specify the size of the hologram that is to be used, and the size and type of fragment substructures that are to be encoded. HQSAR provides color coding to reflect which molecular fragments may be important contributors to the biological activity. In this work, we studied the quantitative structure-activity relationship of selected esters using the HQSAR method. A robust HQSAR model with r² (non-cross-validated regression coefficient) of 0.981 and q² (cross-validated regression coefficient) of 0.912 was developed after optimizing the fragment size and the hologram length. The color coding analysis, which has rarely been reported before, was used here to explain the outlier successfully.

18.
19.
Huuskonen J 《Chemosphere》2003,50(7):949-953
A quantitative structure-activity relationship model, based on the atom-type electrotopological state (E-state) indices, for the prediction of toxicity to fathead minnow for a diverse set of 140 organic chemicals is presented. Multiple linear regression and artificial neural network techniques were employed in the modeling of experimental toxicity (-log LC50) values ranging from 0.85 to 6.09. For the training set of 130 organic compounds a linear regression model with r² = 0.84 and s = 0.36 was obtained with 14 atom-type E-state indices. For the test set of 10 compounds, the corresponding statistics were r² = 0.83 and s = 0.47. Neural networks gave a significant improvement using the same set of parameters, and the standard deviations were s = 0.31 for the training set and s = 0.30 for the test set when an artificial neural network with five neurons in the hidden layer was used. The results clearly show that accurate models can be rapidly calculated for the prediction of toxicity for a diverse set of organic chemicals using easily calculated parameters.

20.

The safety assessment process of chemicals requires information on their mutagenic potential. The experimental determination of mutagenicity of a large number of chemicals is tedious and time and cost intensive, making alternative methods necessary. We have established local and global QSAR models for discriminating low and high mutagenic compounds and predicting their mutagenic activity in a quantitative manner in Salmonella typhimurium (TA) bacterial strains (TA98 and TA100). The decision treeboost (DTB)-based classification QSAR models discriminated between the two categories with accuracies of >96%, and the regression QSAR models precisely predicted the mutagenic activity of diverse chemicals, yielding high correlations (R²) between the experimental and model-predicted values in the respective training (>0.96) and test (>0.94) sets. The test set root mean squared error (RMSE) and mean absolute error (MAE) values emphasized the usefulness of the developed models for predicting new compounds. Relevant structural features of diverse chemicals that influence the mutagenic activity were identified. The applicability domains of the developed models were defined. The developed models can be used as tools for screening new chemicals for their mutagenicity assessment for regulatory purposes.
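For readers unfamiliar with the two test-set error measures named above, a minimal sketch of RMSE and MAE follows. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical experimental and model-predicted activities, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

RMSE weights a single large error more heavily than MAE does, so reporting both gives a rough indication of whether the test-set error is dominated by a few poorly predicted compounds.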
