Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Data Mining and Predictive Modeling of Biomolecular Network from Biomedical Literature Databases

Abstract

In this paper, we present Bio-IEDM (biomedical information extraction and data mining), a novel approach that integrates text mining and predictive modeling to analyze biomolecular networks derived from biomedical literature databases. The method consists of two phases. In phase 1, we describe an efficient semisupervised learning approach that automatically extracts biological relationships, such as protein-protein and protein-gene interactions, from biomedical literature databases in order to construct the biomolecular network. The method automatically learns patterns from a few user-supplied seed tuples and then uses the discovered patterns to extract new tuples from the literature. The derived biomolecular network forms a large scale-free graph. In phase 2, we present a novel clustering algorithm that analyzes this graph to identify biologically meaningful subnetworks (communities). The algorithm takes the characteristics of scale-free network graphs into account and is based on the local density of each vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. The experimental results indicate that the approach is very effective at extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing biomolecular networks.
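
The seed-driven extraction loop described for phase 1 can be illustrated with a minimal sketch. This is not the Bio-IEDM implementation: the abstract does not specify how patterns are represented or scored, so the sketch assumes the simplest possible pattern (the literal text between the two entities), a naive capitalized-token entity matcher, and a made-up four-sentence corpus. The names `learn_patterns`, `extract_tuples`, `bootstrap`, and `ENTITY` are hypothetical.

```python
import re

# Assumption: an entity is any capitalized alphanumeric token. A real system
# would use a proper gene/protein name recognizer instead.
ENTITY = r"([A-Z][A-Za-z0-9]+)"


def learn_patterns(sentences, tuples):
    """Collect the text between the two entities of each known tuple."""
    patterns = set()
    for sent in sentences:
        for a, b in tuples:
            m = re.search(re.escape(a) + r"\s+(.+?)\s+" + re.escape(b), sent)
            if m:
                patterns.add(m.group(1))
    return patterns


def extract_tuples(sentences, patterns):
    """Apply every learned pattern to every sentence to find entity pairs."""
    found = set()
    for sent in sentences:
        for p in patterns:
            regex = ENTITY + r"\s+" + re.escape(p) + r"\s+" + ENTITY
            for a, b in re.findall(regex, sent):
                found.add((a, b))
    return found


def bootstrap(sentences, seeds, rounds=2):
    """Alternate pattern learning and tuple extraction, starting from seeds."""
    tuples = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, tuples)
        new = extract_tuples(sentences, patterns) - tuples
        if not new:  # nothing new was found, so further rounds cannot help
            break
        tuples |= new
    return tuples


# Illustrative sentences written for this sketch, not taken from the paper.
corpus = [
    "RAD51 interacts with BRCA2 in repair foci.",
    "MDM2 interacts with TP53 under stress.",
    "MDM2 binds to TP53 in the nucleus.",
    "CDK2 binds to CCNE1 at the G1/S transition.",
]
network = bootstrap(corpus, {("RAD51", "BRCA2")})
print(sorted(network))
```

Starting from one seed pair, the first round learns the pattern "interacts with" and finds a second pair; the second round uses that pair to learn "binds to" and finds a third. The sketch accepts every pattern it sees, so in practice a pattern-confidence filter would be needed to keep noisy patterns from polluting the tuple set.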
