Generating knowledge graphs by employing Natural Language Processing and Machine Learning techniques within the scholarly domain

Danilo Dessi; Francesco Osborne; Diego Reforgiato Recupero; Davide Buscaldi; Enrico Motta

Abstract

The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is that analysing this literature has become difficult because of the high volume of published papers, whose annotation and management require manual effort. Novel technological infrastructures are needed to help researchers, research policy makers, and companies browse, analyse, and forecast scientific research in a time-efficient way. Knowledge graphs, i.e., large networks of entities and relationships, have proved to be an effective solution in this space. Scientific knowledge graphs focus on the scholarly domain and typically contain metadata describing research publications, such as authors, venues, organizations, research topics, and citations. However, the current generation of knowledge graphs lacks an explicit representation of the knowledge presented in the research papers. In this paper, we therefore present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods to extract entities and relationships from research publications and integrate them into a large-scale knowledge graph. Within this research work, we (i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools, (ii) describe an approach for integrating entities and relationships generated by these tools, (iii) show the advantage of such a hybrid system over alternative approaches, and (iv) as a chosen use case, generate a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain. As our approach is general and can be applied to any domain, we expect that it can facilitate the management, analysis, dissemination, and processing of scientific knowledge.
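
The integration step mentioned in the abstract, where entities and relationships produced by several extraction tools are combined into one graph, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the tool names (`extractor_a`, `extractor_b`), the `normalize` and `integrate` helpers, the `min_support` parameter, and the sample triples are all hypothetical and chosen only to show one simple way such a merge might work.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# A triple is (subject, predicate, object), all plain strings.
Triple = Tuple[str, str, str]


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so surface variants merge."""
    return " ".join(term.lower().split())


def integrate(
    outputs: Dict[str, Iterable[Triple]], min_support: int = 1
) -> Dict[Triple, Set[str]]:
    """Merge triples from several tools (hypothetical helper).

    Returns each normalized triple mapped to the set of tools that
    produced it, keeping only triples found by at least `min_support`
    tools.
    """
    support: Dict[Triple, Set[str]] = defaultdict(set)
    for tool, triples in outputs.items():
        for subj, pred, obj in triples:
            key = (normalize(subj), normalize(pred), normalize(obj))
            support[key].add(tool)
    return {t: tools for t, tools in support.items() if len(tools) >= min_support}


# Made-up output of two hypothetical extraction tools.
outputs = {
    "extractor_a": [("Ontology Alignment", "uses", "Machine  Learning")],
    "extractor_b": [
        ("ontology alignment", "uses", "machine learning"),
        ("SPARQL", "queries", "RDF data"),
    ],
}

graph = integrate(outputs)
for triple, tools in sorted(graph.items()):
    print(triple, sorted(tools))
```

Raising `min_support` to 2 would keep only triples that both hypothetical tools agree on, which is one simple way to trade coverage for precision when combining extractors.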

Bibliographic record

Source
Future Generation Computer Systems | 2021, Issue 3 | pp. 253-264 | 12 pages
Authors
Danilo Dessi; Francesco Osborne; Diego Reforgiato Recupero; Davide Buscaldi; Enrico Motta
Author affiliations

Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Italy; FIZ Karlsruhe - Leibniz Institute for Information Infrastructure, Germany; Karlsruhe Institute of Technology, Institute AIFB, Germany;

Knowledge Media Institute, The Open University, Milton Keynes, UK;

Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Italy;

LIPN, CNRS (UMR 7030), University Paris 13, Villetaneuse, France;

Knowledge Media Institute, The Open University, Milton Keynes, UK;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Text language: English

Similar literature

1. Geographical localization of web domains and organization addresses recognition by employing natural language processing, Pattern Matching and clustering [J]. Paolo Nesi, Gianni Pantaleo, Marco Tenti. Engineering Applications of Artificial Intelligence. 2016, issue May
2. Intelligent compilation of patent summaries using machine learning and natural language processing techniques [J]. Amy J.C. Trappey, Charles V. Trappey, Jheng-Long Wu. Advanced Engineering Informatics. 2020, issue Jana
3. MalDy: Portable, data-driven malware detection using natural language processing and machine learning techniques on behavioral analysis reports [J]. Karbab ElMouatez Billah, Debbabi Mourad. Digital Investigation. 2019, issue APRa
4. Discover trending domains using fusion of supervised machine learning with natural language processing [C]. Lakhanpal Shilpa, Gupta Ajay, Agrawal Rajeev. International Conference on Information Fusion. 2015
5. Enhancing Ontology Learning with Machine Learning and Natural Language Processing Techniques [D]. Liu, Yue. 2019
6. Predictive article recommendation using natural language processing and machine learning to support evidence updates in domain-specific knowledge graphs [O]. Bhuvan Sharma, Van C Willis, Claudia S Huettner. 2020
