Conference: International Conference on Language Resources and Evaluation

Representation Learning for Unseen Words by Bridging Subwords to Semantic Networks


Abstract

Pre-trained word embeddings are widely used in various fields. However, their coverage is limited to words that appear in the corpora on which they were trained. Words that do not appear in the training corpus are therefore ignored in downstream tasks, which can limit the performance of neural models. In this paper, we propose a simple yet effective method to represent out-of-vocabulary (OOV) words. Unlike prior work that uses either subword information or knowledge alone, our method uses both to represent OOV words. To this end, we propose two stages of representation learning. In the first stage, we learn subword embeddings from the pre-trained word embeddings by using an additive composition function of subwords. In the second stage, we map the learned subwords into semantic networks (e.g., WordNet). We then re-train the subword embeddings by using lexical entries in semantic lexicons, which may include newly observed subwords. This two-stage learning greatly broadens word coverage. The experimental results show that our method provides consistent performance improvements over strong baselines that use subwords or lexical resources separately.
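
As a rough illustration of the first stage only, the following Python sketch fits subword vectors whose sum approximates each pre-trained word vector, then composes a vector for an unseen word from the subwords it shares with known words. The abstract does not specify the subword unit, the loss, or the optimizer, so the character n-grams (lengths 3 to 5), the squared-error objective, the scaled gradient step, all names (`char_ngrams`, `compose`, `fit_subwords`), and the toy vectors are assumptions made for this example, not the authors' implementation.

```python
import random


def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of `word`, padded with '<' and '>' boundary markers.

    Assumption: the abstract only says "subwords"; n-grams are one common choice.
    """
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def compose(word, subword_vecs, dim):
    """Additive composition: sum the vectors of every known subword of `word`."""
    total = [0.0] * dim
    for gram in char_ngrams(word):
        vec = subword_vecs.get(gram)
        if vec is not None:
            for k in range(dim):
                total[k] += vec[k]
    return total


def fit_subwords(word_vecs, dim, epochs=500, lr=0.5, seed=0):
    """Fit subword vectors so that their sum approximates each word vector.

    Assumption: squared-error loss minimized with per-word gradient steps,
    scaled by the number of subwords so the update stays stable.
    """
    rng = random.Random(seed)
    subword_vecs = {}
    for word in word_vecs:
        for gram in char_ngrams(word):
            subword_vecs.setdefault(
                gram, [rng.uniform(-0.1, 0.1) for _ in range(dim)])
    for _ in range(epochs):
        for word, target in word_vecs.items():
            grams = char_ngrams(word)
            pred = compose(word, subword_vecs, dim)
            for k in range(dim):
                step = lr * (pred[k] - target[k]) / len(grams)
                for gram in grams:
                    subword_vecs[gram][k] -= step
    return subword_vecs


if __name__ == "__main__":
    # Made-up 2-dimensional "pre-trained" vectors, for illustration only.
    toy_vecs = {
        "walk": [1.0, 0.0],
        "walking": [0.8, 0.2],
        "talk": [0.0, 1.0],
        "talking": [0.1, 0.9],
    }
    subwords = fit_subwords(toy_vecs, dim=2)
    # "walked" is out of vocabulary but shares n-grams such as "walk" with it.
    print(compose("walked", subwords, dim=2))
```

A word that shares no subword with the vocabulary gets an all-zero vector in this sketch. That is presumably the gap the second stage narrows, by re-training on lexical entries from a semantic lexicon such as WordNet that may contain subwords not seen in the first stage.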
