Metadata Extraction from Bibliographies Using Bigram HMM

Abstract

In recent years, huge volumes of research papers have become available on the World Wide Web. Metadata provides a good approach for organizing and retrieving these resources. Accordingly, automatic extraction of metadata from these papers and their bibliographies is meaningful and has been widely studied. In this paper, we use a bigram HMM (Hidden Markov Model) to automatically extract metadata (e.g., title, author, date, journal, and pages) from bibliographies in various styles. Unlike the traditional HMM, which uses only word frequency, this model also considers both the bigram sequential relation between words and their position within text fields. We evaluated the model on a real corpus downloaded from the Web and compared it with other methods. Experiments show that the bigram HMM yields the best results and seems to be the most promising candidate for metadata extraction from bibliographies.
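
The abstract does not give the model's equations, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the first token of a field is emitted from a per-field unigram distribution (the "position" information) and that later tokens in the same field are emitted from an interpolation of a per-field bigram and the unigram. The names `train`, `emit`, `viterbi`, the constants `LAMBDA` and `ALPHA`, and the toy references are all hypothetical.

```python
import math
from collections import defaultdict

LAMBDA = 0.7  # assumed interpolation weight for the bigram emission
ALPHA = 0.1   # assumed add-alpha smoothing constant


def train(tagged_refs):
    """Collect counts from references given as lists of (token, field) pairs."""
    m = {
        "start": defaultdict(float),                       # field of first token
        "trans": defaultdict(lambda: defaultdict(float)),  # field -> next field
        "uni": defaultdict(lambda: defaultdict(float)),    # field -> token
        "bi": defaultdict(lambda: defaultdict(float)),     # (field, prev) -> token
        "vocab": set(),
        "fields": set(),
    }
    for ref in tagged_refs:
        prev_tok = prev_field = None
        for tok, field in ref:
            m["vocab"].add(tok)
            m["fields"].add(field)
            m["uni"][field][tok] += 1
            if prev_field is None:
                m["start"][field] += 1
            else:
                m["trans"][prev_field][field] += 1
                if prev_field == field:  # bigrams are counted within a field only
                    m["bi"][(field, prev_tok)][tok] += 1
            prev_tok, prev_field = tok, field
    return m


def _smoothed(counts, key, n_outcomes):
    """Add-alpha smoothed relative frequency."""
    total = sum(counts.values())
    return (counts.get(key, 0.0) + ALPHA) / (total + ALPHA * n_outcomes)


def emit(m, field, tok, prev_tok, inside_field):
    """Emission score: unigram at the start of a field, interpolated bigram inside it."""
    p_uni = _smoothed(m["uni"][field], tok, len(m["vocab"]) + 1)
    ctx = m["bi"].get((field, prev_tok), {})
    total = sum(ctx.values())
    if not inside_field or total == 0:
        return p_uni  # no usable bigram context: fall back to the unigram
    return LAMBDA * ctx.get(tok, 0.0) / total + (1 - LAMBDA) * p_uni


def viterbi(m, tokens):
    """Return the highest-scoring field label for each token."""
    if not tokens:
        return []
    fields = sorted(m["fields"])
    n = len(fields)
    best = {f: math.log(_smoothed(m["start"], f, n))
               + math.log(emit(m, f, tokens[0], None, False)) for f in fields}
    back = []
    for t in range(1, len(tokens)):
        new_best, pointers = {}, {}
        for f in fields:
            # The emission depends on whether the previous state is the same field.
            new_best[f], pointers[f] = max(
                (best[g] + math.log(_smoothed(m["trans"][g], f, n))
                 + math.log(emit(m, f, tokens[t], tokens[t - 1], g == f)), g)
                for g in fields)
        best = new_best
        back.append(pointers)
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy, hand-labeled references (hypothetical data, for illustration only).
refs = [
    [("Smith", "author"), ("J", "author"), ("Hidden", "title"),
     ("Markov", "title"), ("Models", "title"), ("2004", "date")],
    [("Lee", "author"), ("K", "author"), ("Markov", "title"),
     ("Chains", "title"), ("2003", "date")],
]
model = train(refs)
print(viterbi(model, "Smith J Hidden Markov Models 2004".split()))
```

Because the emission score depends on whether the previous token belongs to the same field, it is evaluated inside the maximization over predecessor states rather than once per state, which is the main departure from a textbook unigram-emission Viterbi decoder.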

Bibliographic record

Source: International Conference on Asian Digital Libraries (ICADL 2004); December 13-17, 2004; Shanghai (CN) | 2004 | pp. 310-319 | 10 pages
Conference location: Shanghai (CN)
Authors: Ping Yin; Ming Zhang; ZhiHong Deng; DongQing Yang
Author affiliation: School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Original format: PDF
Language: English
Chinese Library Classification: Electronic libraries, digital libraries
