Source: Database: The Journal of Biological Databases and Curation

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases



Abstract

Metadata, the machine-readable descriptions of data, are increasingly seen as crucial for describing the vast array of biomedical datasets currently being deposited in public repositories. While most public repositories firmly require that metadata accompany submitted datasets, the quality of those metadata is generally very poor. A key problem is that the typical metadata acquisition process is onerous and time-consuming, with little interactive guidance or assistance provided to users. Secondary problems include the lack of validation and the sparse use of standardized terms or ontologies when authoring metadata. There is a pressing need for improvements to the metadata acquisition process that help users enter metadata quickly and accurately. In this paper, we outline a recommendation system for metadata that aims to address this challenge. Our approach uses association rule mining to uncover hidden associations among metadata values and to represent them as association rules. These rules are then used to present users with real-time recommendations while they author metadata. Our method is novel in two respects: it can combine analyses of metadata from multiple repositories when generating recommendations, and it can enhance those recommendations by aligning them with ontology terms. We implemented our approach as a service integrated into the CEDAR Workbench metadata authoring platform and evaluated it using metadata from two public biomedical repositories: the US-based National Center for Biotechnology Information BioSample and the European Bioinformatics Institute BioSamples. The results show that our approach can use analyses of previously entered metadata, coupled with ontology-based mappings, to present users with accurate recommendations when authoring metadata.
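The core idea in the abstract, mining association rules from previously entered metadata and using them to suggest values for a field, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the field names, the thresholds, and the helper names `mine_rules` and `recommend` are hypothetical, and the brute-force enumeration stands in for whatever mining algorithm the authors actually use. Ontology alignment and the combination of multiple repositories are not modeled here.

```python
from collections import Counter
from itertools import combinations

# Toy metadata records; fields and values are invented for illustration only.
RECORDS = [
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "hepatitis"},
    {"organism": "Homo sapiens", "tissue": "liver", "disease": "cirrhosis"},
    {"organism": "Homo sapiens", "tissue": "lung", "disease": "asthma"},
    {"organism": "Mus musculus", "tissue": "liver", "disease": "hepatitis"},
]


def mine_rules(records, min_support=0.2, min_confidence=0.5, max_antecedent=2):
    """Return rules as (antecedent, consequent, support, confidence).

    The antecedent is a frozenset of (field, value) items and the consequent
    is a single (field, value) item. Itemsets are enumerated by brute force,
    which is only reasonable for tiny examples like this one.
    """
    n = len(records)
    counts = Counter()
    for record in records:
        items = sorted(record.items())
        # Count every itemset up to antecedent size + 1 (the consequent).
        for size in range(1, max_antecedent + 2):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1

    rules = []
    for itemset, count in counts.items():
        support = count / n
        if len(itemset) < 2 or support < min_support:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # confidence = support(antecedent and consequent) / support(antecedent)
            confidence = count / counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


def recommend(rules, entered, target_field):
    """Rank candidate values for target_field given the fields entered so far."""
    entered_items = set(entered.items())
    best = {}
    for antecedent, (field, value), support, confidence in rules:
        # A rule applies when it predicts the target field and all of its
        # antecedent items have already been entered by the user.
        if field == target_field and antecedent <= entered_items:
            score = (confidence, support)
            if score > best.get(value, (0.0, 0.0)):
                best[value] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


rules = mine_rules(RECORDS)
suggestions = recommend(
    rules, {"organism": "Homo sapiens", "tissue": "liver"}, "disease"
)
print(suggestions)
```

In this toy data, a user who has already entered a human liver sample would be offered "hepatitis" for the disease field, because the rule from `tissue = liver` to `disease = hepatitis` holds in three of the four liver records. Ranking by confidence and then support is one plausible choice among several; the source does not specify the ranking used.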
