Cross-lingual Wikification Using Multilingual Embeddings

Abstract

Cross-lingual Wikification is the task of grounding mentions written in non-English documents to entries in the English Wikipedia. This task involves the problem of comparing textual clues across languages, which requires developing a notion of similarity between text snippets across languages. In this paper, we address this problem by jointly training multilingual embeddings for words and Wikipedia titles. The proposed method can be applied to all languages represented in Wikipedia, including those for which no machine translation technology is available. We create a challenging dataset in 12 languages and show that our proposed approach outperforms various baselines. Moreover, our model compares favorably with the best systems on the TAC KBP2015 Entity Linking task including those that relied on the availability of translation from the target language to English.
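
The abstract does not say how the embeddings are used to score candidates, so the following is only a minimal sketch of the general idea of comparing text across languages in one shared vector space, not the authors' model. It assumes that foreign-language context words and English Wikipedia titles have already been embedded in the same space, and it ranks candidate titles by cosine similarity to the averaged context vector. The function names, the three-dimensional toy vectors, and the candidate titles are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_candidates(context_vectors, candidate_titles):
    """Rank candidate titles by similarity to the averaged mention context.

    Assumption: context words and titles live in one shared embedding space.
    """
    context = mean_vector(context_vectors)
    scored = [(title, cosine(context, vec)) for title, vec in candidate_titles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up vectors standing in for jointly trained multilingual embeddings.
context_vectors = [
    [0.9, 0.1, 0.0],  # hypothetical embedding of one non-English context word
    [0.8, 0.2, 0.1],  # hypothetical embedding of another context word
]
candidate_titles = {
    "Paris": [1.0, 0.1, 0.0],         # hypothetical title embedding
    "Paris_Hilton": [0.0, 0.2, 1.0],  # hypothetical title embedding
}

ranking = rank_candidates(context_vectors, candidate_titles)
best_title = ranking[0][0]
print(best_title)
```

Because the comparison happens entirely in the embedding space, a sketch like this needs no translation step, which is the property the abstract highlights for languages without machine translation support.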
