High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods

Abstract

The Environmental Protection Agency holds that quantitative structure-activity relationship (QSAR) analysis can serve as a replacement for toxicity tests. In this paper, we developed QSAR methods to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different partitions. The optimal SVR models had the highest external prediction ability under all three training-test partitions, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software, and used them to construct MLR, SLR and SVR models. Our best SVR model (R²(pred) = 0.972) showed a significant 16.55% increase on the test set, and an appropriate partition gave better stability. Different partitions of the training-test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor through interpretability analysis. Our work provides a valuable exploration of different combinations of data partition, descriptor selection and modelling method, and a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.
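
The idea of scoring one model on several training-test partitions with an external predictive R² can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, a single made-up descriptor with a one-variable least-squares fit stands in for the MLR/SLR/SVR models, the partitions are plain seeded shuffles, and the formula used (residual sum of squares over the test-set sum of squares about the training-set mean) is one common QSAR definition that the abstract does not confirm.

```python
import random


def r2_pred(y_train, y_test, y_hat):
    """External predictive R^2 (assumed definition): 1 - PRESS / SS,
    where SS is the test-set sum of squares about the training-set mean."""
    mean_train = sum(y_train) / len(y_train)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_hat))
    ss = sum((y - mean_train) ** 2 for y in y_test)
    return 1.0 - press / ss


def fit_line(x, y):
    """Ordinary least squares for one descriptor; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def evaluate_partition(x, y, seed, test_fraction=0.2):
    """Shuffle indices with the given seed, hold out a test set,
    fit on the rest, and return R^2_pred on the held-out compounds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
    y_hat = [slope * x[i] + intercept for i in test]
    return r2_pred([y[i] for i in train], [y[i] for i in test], y_hat)


# Synthetic stand-in data: 50 "compounds", one hypothetical descriptor,
# and a noisy linear response. None of these values come from the paper.
rng = random.Random(0)
x = [rng.uniform(0.0, 5.0) for _ in range(50)]
y = [0.8 * v + 0.5 + rng.gauss(0.0, 0.2) for v in x]

# Three different training-test partitions of the same dataset.
scores = [evaluate_partition(x, y, seed) for seed in (1, 2, 3)]
print(scores)
```

Comparing the three scores gives a rough sense of how stable a model's external prediction is with respect to the choice of partition, which matters most for small datasets such as the 50 compounds described here.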

Bibliographic record

  • Source
    RSC Advances | 2016, Issue 108 | 9 pages
  • Author affiliations

    Hunan Agr Univ, Hunan Prov Key Lab Biol & Control Plant Dis & Ins, Changsha 410128, Hunan, Peoples R China

  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Chemistry