Using NLP to Generate MARC Summary Fields for Notre Dame’s Catholic Pamphlets

Jeremiah Flannery

Abstract

Three automated natural language processing (NLP) summarization techniques were tested on a special collection of Catholic pamphlets acquired by Hesburgh Libraries. The automated summaries were generated after feeding the pamphlets as PDF files into an OCR pipeline. Extensive data cleaning and text preprocessing were necessary before the summarization algorithms could be run. Under the standard ROUGE F1 scoring technique, the BERT Extractive Summarizer had the best summarization score, meaning its output most closely matched the human reference summaries. The BERT Extractive technique yielded an average ROUGE F1 score of 0.239. The Gensim Python package's implementation of TextRank scored 0.151, and a hand-implemented TextRank algorithm created summaries that scored 0.144. This article covers the implementation of automated pipelines to read PDF text, the strengths and weaknesses of automated summarization techniques, and what the successes and failures of these summaries mean for their potential use in Hesburgh Libraries.
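
To make the scores above easier to interpret, here is a minimal sketch of how a ROUGE F1 score can be computed between a machine summary and a human reference. The abstract does not say which ROUGE variant or tooling was used, so this assumes ROUGE-N over n-gram overlap (unigrams by default) with simple lowercase tokenization and no stemming. The function names and the two sample sentences are hypothetical and are not taken from the article.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())  # share of candidate n-grams matched
    recall = overlap / sum(ref.values())      # share of reference n-grams matched
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example texts, for illustration only.
    reference = "A pamphlet explaining the sacrament of confession to lay Catholics."
    candidate = "This pamphlet explains confession for Catholics."
    print(rouge_n_f1(candidate, reference))
```

Because nothing is stemmed, "explains" and "explaining" do not count as a match here. Published ROUGE toolkits often apply stemming and other normalization, so their scores can differ from this sketch.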


Bibliographic details

Source: International Journal of Librarianship, 2020, Issue 1, 17 pages
Author: Jeremiah Flannery
Original format: PDF
Keywords: Natural Language Processing; Special Collections; Summarization; Extractive Summarization; Machine Learning; Catholic Pamphlet
