BMC Genomics

BMRF-MI: integrative identification of protein interaction network by modeling the gene dependency



Abstract

Background: Identifying protein interaction networks is an important step toward understanding the molecular mechanisms of cancer. Several methods have been developed to integrate protein-protein interaction (PPI) data with gene expression data for network identification. However, they often fail to model the dependency between genes in the network, which leaves many important genes, especially upstream genes, unidentified. A method that incorporates the dependency between genes is therefore needed to improve network identification performance.

Results: We propose an approach for identifying protein interaction networks that incorporates mutual information (MI) into a Markov random field (MRF) based framework to model the dependency between genes. MI is widely used in information theory to quantify the dependence between random variables. Unlike the traditional Pearson correlation test, MI captures both linear and non-linear relationships between random variables. Among the existing MI estimators, we use the k-nearest neighbor MI (kNN-MI) estimator, which has been shown to have minimal bias. The estimated MI is integrated with the MRF framework to model gene dependency in the context of the network. Maximum a posteriori (MAP) estimation is applied to the MRF-based model to estimate the network score. To reduce the computational complexity of finding the optimal network, a probabilistic search algorithm is implemented. We further increase the robustness and reproducibility of the results by applying a non-parametric bootstrapping method to measure the confidence level of the identified genes. To evaluate the proposed method, we test it on simulation data under different conditions. The experimental results show improved subnetwork identification accuracy compared with existing methods. Furthermore, we applied our method to real breast cancer patient data; the identified protein interaction network shows a close association with the recurrence of breast cancer, which is supported by functional annotation. We also show, through survival analysis, that the identified subnetworks can be used to predict the recurrence status of cancer patients.

Conclusions: We have developed an integrated approach for protein interaction network identification that combines a Markov random field framework with mutual information to model gene dependency in the PPI network. Improvements in subnetwork identification over existing methods have been demonstrated on simulation datasets. We then applied our method to breast cancer patient data to identify recurrence-related subnetworks. The experimental results show that the identified genes are enriched in pathways and functional categories relevant to the progression and recurrence of breast cancer. Finally, survival analysis based on the identified subnetworks performs well in classifying the recurrence status of cancer patients.
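To make the contrast between MI and Pearson correlation concrete, the following is a minimal sketch of one common kNN-based MI estimator (the Kraskov-Stögbauer-Grassberger form). The abstract does not say which kNN-MI variant the authors use, so the choice of estimator, the function names, the value k=3, and the synthetic data are all assumptions made for illustration. This is not the authors' implementation, and it covers only the pairwise MI step, not the MRF model, the MAP scoring, or the bootstrapping.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_mutual_information(xs, ys, k=3):
    """Estimate I(X;Y) in nats from paired samples (illustrative KSG-style sketch).

    For each point, eps is the max-norm distance to its k-th nearest neighbor
    in the joint (x, y) space; n_x and n_y count the other points strictly
    closer than eps along each axis separately.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        joint = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = joint[k - 1]
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


# Synthetic "expression profiles" (hypothetical data): gene_b depends on
# gene_a through a symmetric, non-linear relationship plus a little noise.
rng = random.Random(0)
gene_a = [rng.uniform(-1.0, 1.0) for _ in range(300)]
gene_b = [a * a + rng.gauss(0.0, 0.05) for a in gene_a]
gene_c = [rng.uniform(-1.0, 1.0) for _ in range(300)]  # independent of gene_a

print("Pearson(a, b):", round(pearson(gene_a, gene_b), 3))
print("kNN-MI(a, b): ", round(knn_mutual_information(gene_a, gene_b), 3))
print("kNN-MI(a, c): ", round(knn_mutual_information(gene_a, gene_c), 3))
```

On data like this, the Pearson coefficient for the quadratic pair should be close to zero while the MI estimate is clearly positive, and the MI estimate for the independent pair should be near zero. The sketch runs in quadratic time in the number of samples, so a tree-based neighbor search would be the natural replacement for real expression data.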
