Journal: Molecular Informatics

Towards Proteome-Wide Interaction Models Using the Proteochemometrics Approach


Abstract

A proteochemometrics model was induced from all interaction data in the BindingDB database, comprising 7078 protein-ligand complexes in total, with representatives from all major drug target categories. Proteins were represented by alignment-independent sequence descriptors holding information on properties such as hydrophobicity, charge, and secondary structure. Ligands were represented by commonly used QSAR descriptors. The inhibition constant (pKi) values of the protein-ligand complexes were discretized into "high" and "low" interaction activity. Different machine-learning techniques were used to induce models relating protein and ligand properties to the interaction activity. Decision trees performed best, giving an accuracy of 80% and an area under the ROC curve of 0.81. The tree pointed to the protein and ligand properties that are relevant for the interaction. As the approach requires neither alignments nor knowledge of protein 3D structures, virtually all available protein-ligand interaction data could be utilized, thus opening a way to completely general interaction models that may span entire proteomes.
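The data preparation and scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the pKi cut-off, the descriptor values, or the tooling used, so the threshold of 7.0, the toy numbers, and all function names below are assumptions made for illustration. The sketch covers only three steps: discretizing pKi into "high" and "low", joining protein and ligand descriptors into one feature row per complex, and computing accuracy and the area under the ROC curve for a set of predictions.

```python
from statistics import median


def discretize(pki_values, threshold=None):
    """Label each pKi value "high" or "low".

    The cut-off is an assumption: the abstract does not give one,
    so the median is used when no threshold is passed.
    """
    if threshold is None:
        threshold = median(pki_values)
    return ["high" if v >= threshold else "low" for v in pki_values]


def pair_features(protein_desc, ligand_desc):
    """Concatenate protein and ligand descriptors into one feature row."""
    return list(protein_desc) + list(ligand_desc)


def accuracy(true_labels, predicted_labels):
    """Fraction of complexes whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def roc_auc(true_labels, scores, positive="high"):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(true_labels, scores) if t == positive]
    neg = [s for t, s in zip(true_labels, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up values (not taken from BindingDB).
pki = [9.1, 5.2, 7.8, 4.9, 6.5, 8.3]
labels = discretize(pki, threshold=7.0)

# Hypothetical classifier scores for the "high" class, cut at 0.5.
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
predicted = ["high" if s >= 0.5 else "low" for s in scores]

# One feature row: two protein descriptors followed by one ligand descriptor.
row = pair_features([0.31, -1.2], [342.4])

print(labels, accuracy(labels, predicted), roc_auc(labels, scores))
```

In a real study the feature rows and labels would be passed to a classifier such as a decision tree; that step is left out here to keep the sketch dependency-free.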
