
Exploring Numeracy in Word Embeddings

Abstract

Word embeddings are now pervasive across NLP subfields as the de-facto method of forming text representations. In this work, we show that existing embedding models are inadequate at constructing representations that capture salient aspects of mathematical meaning for numbers, which is important for language understanding. Numbers are ubiquitous and frequently appear in text. Inspired by cognitive studies on how humans perceive numbers, we develop an analysis framework to test how well word embeddings capture two essential properties of numbers: magnitude (e.g. 3<4) and numeration (e.g. 3=three). Our experiments reveal that most models capture an approximate notion of magnitude, but are inadequate at capturing numeration. We hope that our observations provide a starting point for the development of methods which better capture numeracy in NLP systems.
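The magnitude property described in the abstract can be probed with a simple linear regression from an embedding to the (log-scaled) numeric value, checking whether predictions preserve the true ordering. The sketch below illustrates this probing setup on synthetic stand-in vectors; the random embeddings, the magnitude-correlated direction, and the log-value target are all illustrative assumptions, not the paper's actual framework or pretrained vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in embeddings for the numbers 1..10: random noise plus a weak
# direction correlated with log-magnitude (assumption for illustration only).
numbers = np.arange(1, 11)
dim = 50
direction = rng.normal(size=dim)
embeddings = rng.normal(size=(len(numbers), dim)) + np.log(numbers)[:, None] * direction

# Magnitude probe: least-squares fit from embedding (plus bias) to log-value,
# then test whether the predicted values rank the numbers correctly.
X = np.hstack([embeddings, np.ones((len(numbers), 1))])
w, *_ = np.linalg.lstsq(X, np.log(numbers), rcond=None)
pred = X @ w
order_preserved = bool(np.all(np.argsort(pred) == np.argsort(numbers)))
print("magnitude ordering preserved:", order_preserved)
```

A numeration probe would be analogous: compare the vector for a numeral (e.g. "3") against its word form ("three"), for instance by nearest-neighbor lookup, and check whether the two surface forms of the same number are each other's closest match.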
