Learning representations of Web entities for entity resolution

Luciano Barbosa

Abstract

Purpose - Matching instances of the same entity, a task known as entity resolution, is a key step in data integration. This paper proposes a deep learning network that learns different representations of Web entities for entity resolution.

Design/methodology/approach - To match Web entities, the proposed network learns three representations of each entity: embeddings, which are vector representations of the entity's words in a low-dimensional space; convolutional vectors, produced by a convolutional layer, which capture short-distance patterns in the entity's word sequences; and bag-of-words vectors, produced by a bag-of-words (BoW) layer that learns task-specific weights for the words in the vocabulary. Given a pair of entities, the similarity between their learned representations is used as a feature for a binary classifier that decides whether the pair is a possible match. In addition to those features, the classifier uses a modification of inverse document frequency for pairs, which identifies discriminative words in pairs of entities.

Findings - The proposed approach was evaluated on two commercial and two academic entity resolution benchmark data sets. The results show that the proposed strategy outperforms previous approaches on the commercial data sets, which are more challenging, and achieves results similar to those of its competitors on the academic data sets.

Originality/value - No previous work has used a single deep learning framework to learn different representations of Web entities for entity resolution.
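The pipeline described in the abstract (per-representation similarities fed to a binary classifier) can be illustrated with a small sketch. This is not the paper's implementation: the embeddings, word weights, IDF values and classifier weights below are hand-picked toy values rather than learned ones, the convolutional representation is omitted, and `pair_idf_overlap` is a hypothetical stand-in for the pairwise IDF feature, whose exact definition the abstract does not give.

```python
import math
from collections import Counter

# Toy, hand-picked values (assumptions for illustration; in the paper these
# representations are learned by the network).
EMBEDDINGS = {
    "canon": [0.9, 0.1], "powershot": [0.8, 0.3], "camera": [0.7, 0.4],
    "nikon": [0.2, 0.9], "coolpix": [0.1, 0.8],
}
VOCAB = sorted(EMBEDDINGS)
WORD_WEIGHTS = {"camera": 0.2, "canon": 1.0, "coolpix": 1.5,
                "nikon": 1.0, "powershot": 1.5}
IDF = {"camera": 0.1, "canon": 2.0, "coolpix": 2.0,
       "nikon": 2.0, "powershot": 2.0}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_embedding(tokens, dim=2):
    """Average the embeddings of known tokens (a simple pooling choice)."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def weighted_bow(tokens):
    """Bag-of-words vector with one per-word weight for each vocabulary entry."""
    counts = Counter(tokens)
    return [counts[w] * WORD_WEIGHTS[w] for w in VOCAB]


def pair_idf_overlap(tokens_a, tokens_b):
    """Hypothetical pairwise IDF feature: IDF mass of shared words
    divided by IDF mass of all words in the pair."""
    shared = set(tokens_a) & set(tokens_b)
    union = set(tokens_a) | set(tokens_b)
    total = sum(IDF.get(w, 0.0) for w in union)
    if total == 0.0:
        return 0.0
    return sum(IDF.get(w, 0.0) for w in shared) / total


def pair_features(entity_a, entity_b):
    """One similarity feature per representation, plus the pairwise IDF feature."""
    ta, tb = tokenize(entity_a), tokenize(entity_b)
    return [
        cosine(mean_embedding(ta), mean_embedding(tb)),
        cosine(weighted_bow(ta), weighted_bow(tb)),
        pair_idf_overlap(ta, tb),
    ]


def match_probability(features, weights=(2.0, 2.0, 2.0), bias=-3.0):
    """Logistic binary classifier with untrained, illustrative weights."""
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


same = pair_features("Canon PowerShot camera", "canon powershot camera")
diff = pair_features("Canon PowerShot camera", "Nikon Coolpix camera")
print(match_probability(same), match_probability(diff))
```

In this toy setting the shared word "camera" carries little weight in both the bag-of-words vector and the IDF feature, so the Canon/Nikon pair scores low even though the two descriptions overlap on one token.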

Bibliographic record

Source: International journal of web information systems, 2019, Issue 3, pp. 346-358 (13 pages)
Author: Luciano Barbosa
Affiliation: Universidade Federal de Pernambuco, Recife, Brazil
Original format: PDF
Language: English
Keywords: Entity resolution; Representation learning; Web entity

Similar literature

1. Learning representations of Web entities for entity resolution [J]. Luciano Barbosa. International journal of web information systems. 2019, Issue 3
2. The Representation of a Multimedia Franchise as a Single Entity: Contrasting Existing Bibliographic Entities With Web-Based Superwork Portrayals [J]. Senan Kiryakos, Shigeo Sugimoto. Libres: Library and Information Science Research Electronic Journal. 2018, Issue 2
3. Learning entity-centric document representations using an entity facet topic model [J]. Chuan Wu, Evangelos Kanoulas, Maarten de Rijke. Information Processing & Management. 2020, Issue 3
4. Big data entity resolution: From highly to somehow similar entity descriptions in the Web [C]. Efthymiou Vasilis, Stefanidis Kostas, Christophides Vassilis. IEEE International Congress on Big Data. 2015
5. Design and construction of an entity resolution system that supports entity identity information management and asserted resolution [D]. Nelson, Eric Derrand. 2011
6. Learning adaptive representations for entity recognition in the biomedical domain [O]. Ivano Lauriola, Fabio Aiolli, Alberto Lavelli. 2021
7. Declarative Entity Resolution Via Matching Dependencies and Combining Matching Dependencies With Machine Learning for Entity Resolution [O]. Zeinab Bahmani
8. Entity Came to Rescue - Leveraging Entities to Minimize Risks in Web Search [R]. Liu, X., Yang, P., Fang, H. 2014