Journal: Records Management Journal

From tree to network: reordering an archival catalogue

Abstract

Purpose - This paper presents the results of a number of experiments performed at the National Archives, all related to the theme of linking collections of records. It aims to present a methodology for translating a hierarchy into a network structure, using a number of methods for deriving statistical distributions from records' metadata or content and then aggregating them. Simple similarity metrics are then used to compare and link collections of records with similar characteristics.

Design/methodology/approach - The approach taken is to consider a record at any level of the catalogue hierarchy as a summary of its children. A distribution is created for each child record (e.g. word counts or a date distribution) and averaged or summed with those of the other children. This process is repeated up the hierarchy to find a representative distribution for the whole series. By doing this, the authors can compare record series with one another and create a similarity network.

Findings - The summarising method was found to be applicable not only to a hierarchical catalogue but also to web archive data, which is by nature stored in a hierarchical folder structure. The case studies raised many questions worthy of further exploration, such as how to present distributions and uncertainty to users and how to compare methods that produce similarity scores on different scales.

Originality/value - Although the techniques used to create distributions, such as topic modelling and word frequency counts, are not new and have been used to compare documents, to the best of the authors' knowledge applying the averaging approach to the archival catalogue is new. This provides an interesting method for zooming in and out of a collection, creating networks at different levels of granularity according to user needs.
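The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nested-dictionary record layout, the function names, the choice of summed word counts as the distribution, cosine similarity as the metric, and the 0.3 threshold are all assumptions made for illustration. Each leaf record contributes a word-count distribution, each parent is summarised as the sum of its children, and series whose summaries are similar enough are linked by an edge.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def summarise(node):
    """Return a word-count distribution summarising a record.

    `node` is a hypothetical dict with either a "text" description
    (leaf record) or a "children" list (higher-level record).
    A parent is summarised as the sum of its children's distributions.
    """
    children = node.get("children", [])
    if not children:
        return Counter(node.get("text", "").lower().split())
    total = Counter()
    for child in children:
        total += summarise(child)  # repeat the aggregation up the hierarchy
    return total


def cosine(a, b):
    """Cosine similarity between two count distributions (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_network(series, threshold=0.3):
    """Link every pair of series whose summaries score at or above `threshold`.

    Returns a list of (name_a, name_b, score) edges. The threshold is an
    arbitrary illustrative value.
    """
    summaries = {name: summarise(node) for name, node in series.items()}
    edges = []
    for a, b in combinations(sorted(summaries), 2):
        score = cosine(summaries[a], summaries[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


if __name__ == "__main__":
    # Made-up series for illustration only; not real catalogue data.
    series = {
        "A": {"children": [{"text": "ship crew list"}, {"text": "ship log"}]},
        "B": {"children": [{"text": "crew list ship"},
                           {"children": [{"text": "ship log"}]}]},
        "C": {"children": [{"text": "tax return"}]},
    }
    for edge in similarity_network(series):
        print(edge)
```

Because `summarise` can be called on a node at any depth, the same code yields networks at different levels of granularity. Cosine similarity ignores overall scale, so summing and averaging the children's counts give the same scores here; other metrics may not share that property.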