Journal of Cheminformatics

Standards-based curation of a decade-old digital repository dataset of molecular information


Abstract

Background: We report the curation of 158,122 molecular geometries derived from the NCI set of reference molecules, together with associated properties computed using the MOPAC semi-empirical quantum mechanical method. The data were originally deposited in 2005 into the Cambridge DSpace repository as a data collection.

Results: The curation procedures included annotating the original data using new MOPAC methods, updating the syntax of the CML documents used to express the data to ensure schema conformance, and adding new metadata describing the entries, together with an XML schema transformation that maps the metadata schema to the one used by the DataCite organisation. We adopted a granularity model in which a DataCite persistent identifier (DOI) is created for each individual molecule, enabling data discovery and data metrics at this level using DataCite tools.

Conclusions: We recommend that future research data management (RDM) of the scientific and chemical data components associated with journal articles (the "supporting information") be conducted in a manner that facilitates automatic periodic curation.
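To make the metadata mapping described in the Results more concrete, the following is a minimal sketch of building a DataCite-style record for one molecule entry, so that each entry carries its own DOI. It is not the authors' method: the abstract describes an XML schema transformation, whereas this sketch uses Python's standard `xml.etree.ElementTree` for illustration. The kernel-4 namespace, the record field names, the `to_datacite` helper, and all example values (including the DOI) are assumptions and are not taken from the source.

```python
import xml.etree.ElementTree as ET

# Assumption: DataCite kernel-4 namespace; the source does not say which
# schema version was targeted.
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def to_datacite(record: dict) -> ET.Element:
    """Map a hypothetical per-molecule metadata record to a DataCite-style element."""
    ET.register_namespace("", DATACITE_NS)

    def q(tag: str) -> str:
        # Qualify a tag name with the DataCite namespace.
        return f"{{{DATACITE_NS}}}{tag}"

    resource = ET.Element(q("resource"))

    # One persistent identifier (DOI) per molecule entry.
    identifier = ET.SubElement(resource, q("identifier"), identifierType="DOI")
    identifier.text = record["doi"]

    creators = ET.SubElement(resource, q("creators"))
    for name in record["creators"]:
        creator = ET.SubElement(creators, q("creator"))
        ET.SubElement(creator, q("creatorName")).text = name

    titles = ET.SubElement(resource, q("titles"))
    ET.SubElement(titles, q("title")).text = record["title"]

    ET.SubElement(resource, q("publisher")).text = record["publisher"]
    ET.SubElement(resource, q("publicationYear")).text = str(record["year"])

    resource_type = ET.SubElement(
        resource, q("resourceType"), resourceTypeGeneral="Dataset"
    )
    resource_type.text = "Molecular geometry"
    return resource


# Hypothetical record: every value here is a placeholder, not real data.
example = {
    "doi": "10.5555/example-molecule-1",
    "creators": ["Example, Author"],
    "title": "Hypothetical molecule entry",
    "publisher": "Example Repository",
    "year": 2005,
}

xml_text = ET.tostring(to_datacite(example), encoding="unicode")
print(xml_text)
```

Because the helper takes a single record, it can be applied to each entry in a collection in turn, which matches the per-molecule granularity described above.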