Source journal: Archives

THE DATING OF UNDATED MEDIEVAL CHARTERS

Abstract

Approximately 95% of all English charters issued between the Conquest in 1066 and the beginning of the reign of Edward II in 1307 carry no date. One of the major objectives of the DEEDS Project (Documents of Early England Data Set) at the University of Toronto has been to estimate the dates of these undated documents automatically. This paper describes a World Wide Web user-interface toolkit for dating undated English charters, together with the two computationally intensive dating methodologies that underlie it: the Maximum Prevalence method and a distance-based method. The Maximum Prevalence method, the more accurate of the two, analyzes changes in the pattern of word and phrase usage, as derived from a carefully selected collection of thousands of dated documents that have been electronically transcribed and stored in the DEEDS corpus. Beyond dating documents, the toolkit includes features for visualizing this pattern of change, which makes it useful to historians, archivists and linguists alike. The distance-based method computes a weighted sum of the dates of the documents in the DEEDS collection. The weights are determined by the similarity between the undated document and the documents in the dated collection: the higher the similarity, the higher the weight, and the lower the similarity, the lower the weight. The performance of each dating method is reported on a test set, where the average absolute errors of the Maximum Prevalence and distance-based methods are 7.6 and 12.5 years, respectively. A 'leave-one-out' cross-validation experiment performed on the more than 12,000 documents in the test set confirms the accuracy of the methodology. The strengths and weaknesses of each dating method are discussed. In addition, a full description of the DEEDS corpus from England and continental Europe is provided, including the kinds of metadata that have been compiled from it.
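
The distance-based idea in the abstract (a similarity-weighted sum of known dates, checked by leave-one-out cross-validation) can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure or the weighting function, so everything concrete here is an assumption made for illustration: cosine similarity over word bigrams, weights equal to that similarity raised to a hypothetical `power` parameter, and the function names `estimate_date` and `leave_one_out_mae`. It is not the DEEDS implementation.

```python
import math
from collections import Counter


def shingles(text, n=2):
    """Count word n-grams (assumed feature; the paper's features may differ)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_date(undated_text, dated_docs, power=4.0):
    """Similarity-weighted average of the years in dated_docs.

    dated_docs is a list of (text, year) pairs. Higher similarity gives a
    higher weight; `power` (an assumed knob) sharpens that preference.
    Returns None when the text shares nothing with any dated document.
    """
    target = shingles(undated_text)
    weighted = [(cosine(target, shingles(text)) ** power, year)
                for text, year in dated_docs]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return None
    return sum(w * year for w, year in weighted) / total


def leave_one_out_mae(dated_docs, power=4.0):
    """Mean absolute error in years when each dated document is held out
    in turn and dated from the remaining ones."""
    errors = []
    for i, (text, year) in enumerate(dated_docs):
        rest = dated_docs[:i] + dated_docs[i + 1:]
        estimate = estimate_date(text, rest, power)
        if estimate is not None:
            errors.append(abs(estimate - year))
    return sum(errors) / len(errors) if errors else None
```

Called as `estimate_date(text, corpus)` with `corpus` a list of `(transcription, year)` pairs, the function returns a fractional year. `leave_one_out_mae(corpus)` gives the kind of average absolute error the abstract reports, although on a toy corpus the value says nothing about the published figures.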
