Journal of Integrative Bioinformatics

IDPredictor: predict database links in biomedical database



Abstract

Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure. To support functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing a comprehensive knowledge excerpt from the interlinked databases. A prerequisite for supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. This is hampered by the fact that only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems. In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and possible referenced data targets. We study the retrieval quality of our method and report on first, promising results. The method is implemented as the tool IDPredictor, which is published under the DOI 10.5447/IPK/2012/4 and is freely available at the URL http://dx.doi.org/10.5447/IPK/2012/4.
