
Computational prediction of protein-protein interactions on the proteomic scale using Bayesian ensemble of multiple feature databases.



Abstract

In the post-genomic world, one of the most important and challenging problems is to understand protein-protein interactions (PPIs) on a large scale. PPIs are integral to the mechanisms underlying most fundamental cellular processes. Experimental methods such as protein affinity chromatography, affinity blotting, and immunoprecipitation have traditionally helped detect PPIs on a small scale. More recently, high-throughput methods have made an increasing amount of PPI data available. However, these data contain a significant number of false positives and false negatives, and PPIs pooled from different methods show little overlap, which severely limits their reliability. Because of these limitations, computational prediction is emerging as a way to narrow down the set of putative PPIs.

In this dissertation, a novel computational PPI predictor was devised to predict PPIs with high accuracy. The predictor integrates a number of proteomic features derived from biological databases. The features chosen for this research were gene expression, gene ontology, MIPS functions, sequence patterns such as motifs and domains, and protein essentiality. While these features have little or no correlation with each other, each is related to some degree to the ability of proteins to interact. Novel feature-specific approaches were therefore devised to characterize that relationship. Text mining and network topology based approaches were also studied. Gold-standard data comprising high-confidence PPIs and non-PPIs were used as evidence of interaction or lack thereof.

The predictive power of the individual features was integrated using Bayesian methods. The average accuracy, based on 10-fold cross-validation, was found to be 0.9396. Since all the features are computed on the proteomic scale, the Bayesian integration yields likelihood values for all possible pairs of proteins in the proteome. This has the added benefit of making it possible to list putative PPIs in decreasing order of confidence, expressed as likelihood values.

Integration of novel PPIs with other relevant biological information using Semantic Web representation was examined to better understand the underlying mechanisms of diseases and to identify novel targets for drug discovery.
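The abstract does not say which Bayesian method was used. As an illustration only, the sketch below assumes a naive Bayes combination of per-feature likelihood ratios, which is one common way to integrate weakly correlated evidence, and then ranks candidate pairs by the combined score. All function names, feature names, bin encodings, and the toy gold-standard data are hypothetical and are not taken from the dissertation.

```python
from collections import Counter
from math import log


def fit_likelihood_ratios(pos, neg, n_bins, alpha=1.0):
    """Per-bin ratio P(bin | PPI) / P(bin | non-PPI) for one discretized feature.

    pos, neg: bin labels (0..n_bins-1) observed for gold-standard
    interacting and non-interacting pairs.
    alpha: Laplace smoothing constant (an assumption; avoids zero counts).
    """
    pos_counts, neg_counts = Counter(pos), Counter(neg)
    ratios = []
    for b in range(n_bins):
        p_pos = (pos_counts[b] + alpha) / (len(pos) + alpha * n_bins)
        p_neg = (neg_counts[b] + alpha) / (len(neg) + alpha * n_bins)
        ratios.append(p_pos / p_neg)
    return ratios


def combined_log_lr(pair_bins, ratios_by_feature):
    """Sum of per-feature log likelihood ratios.

    Summing logs is equivalent to multiplying the ratios, which is valid
    only under the naive Bayes assumption of conditional independence.
    """
    return sum(log(ratios_by_feature[f][b]) for f, b in pair_bins.items())


def rank_pairs(candidates, ratios_by_feature):
    """Return candidate pairs sorted by decreasing combined score."""
    scored = [(combined_log_lr(bins, ratios_by_feature), pair)
              for pair, bins in candidates.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored]


# Hypothetical toy gold standard: two binary features (0 = low, 1 = high).
gold = {
    "coexpression": {"pos": [1, 1, 1, 0], "neg": [0, 0, 0, 1]},
    "shared_go":    {"pos": [1, 1, 0, 1], "neg": [0, 0, 1, 0]},
}
ratios_by_feature = {
    name: fit_likelihood_ratios(d["pos"], d["neg"], n_bins=2)
    for name, d in gold.items()
}

# Hypothetical candidate protein pairs with their feature bins.
candidates = {
    ("A", "B"): {"coexpression": 1, "shared_go": 1},
    ("A", "C"): {"coexpression": 1, "shared_go": 0},
    ("B", "C"): {"coexpression": 0, "shared_go": 0},
}
ranking = rank_pairs(candidates, ratios_by_feature)
print(ranking)
```

Working in log space keeps the score numerically stable when many features are combined. The independence assumption is the main design risk: strongly correlated features would be double-counted and would need a joint model instead.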

Bibliographic record

  • Author: Kumar, Vivek
  • Author affiliation: The University of Akron
  • Degree-granting institution: The University of Akron
  • Subject: Engineering, Biomedical
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 188
  • Original format: PDF
  • Language of text: English
