International Conference on Intelligent Computer Mathematics

Explorations into the Use of Word Embedding in Math Search and Math Semantics



Abstract

Word embedding, which represents individual words as semantically rich numerical vectors, has made it possible to apply deep learning successfully to NLP tasks such as semantic role modeling, question answering, and machine translation. Because math text consists of natural-language text together with math expressions that similarly exhibit linear correlation and contextual characteristics, word embedding can be applied to math documents as well. On the other hand, math terms also show characteristics (e.g., abstraction) that differ from those of textual words. It is therefore worthwhile to explore the use and effectiveness of word embedding in math language processing and mathematical knowledge management (MKM). In this paper, we present exploratory investigations of math embedding by testing it on basic tasks such as (1) math-term similarity; (2) analogy; (3) basic numerical concept modeling, using a novel approach based on computing the (weighted) centroid of the keywords that characterize a concept; and (4) math search, especially query expansion, in which we compute the weighted centroid of the query keywords and then expand the query with the new keywords most similar to that centroid. Owing to the lack of benchmarks, our investigations used carefully selected illustrations on the Digital Library of Mathematical Functions (DLMF). From these investigations we draw some general observations and lessons that chart a trajectory for future, statistically significant testing on large benchmarks. Our preliminary results and observations show that math embedding holds much promise but also point to the need for more robust embedding.
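The centroid-based query expansion described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-dimensional vectors are invented toy values (not embeddings trained on the DLMF), the keyword weights are arbitrary, cosine similarity is assumed as the similarity measure, and the function names are hypothetical. The abstract does not specify any of these details.

```python
import numpy as np

# Toy, made-up 3-d embeddings used only for illustration.
# They are NOT taken from the paper or trained on the DLMF.
EMBEDDINGS = {
    "sine":      np.array([0.9, 0.1, 0.0]),
    "cosine":    np.array([0.8, 0.2, 0.0]),
    "tangent":   np.array([0.7, 0.3, 0.1]),
    "gamma":     np.array([0.0, 0.9, 0.2]),
    "factorial": np.array([0.1, 0.8, 0.3]),
    "prime":     np.array([0.0, 0.1, 0.9]),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def weighted_centroid(keywords, weights, embeddings):
    """Weighted average of the keyword vectors."""
    vectors = np.array([embeddings[k] for k in keywords])
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * vectors).sum(axis=0) / w.sum()


def expand_query(keywords, weights, embeddings, top_k=2):
    """Append the top_k vocabulary terms closest to the query centroid."""
    centroid = weighted_centroid(keywords, weights, embeddings)
    candidates = [
        (term, cosine_similarity(centroid, vec))
        for term, vec in embeddings.items()
        if term not in keywords  # only propose new keywords
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return list(keywords) + [term for term, _ in candidates[:top_k]]


# Hypothetical query: "sine" weighted twice as heavily as "cosine".
expanded = expand_query(["sine", "cosine"], [2.0, 1.0], EMBEDDINGS, top_k=1)
print(expanded)
```

With these toy vectors the centroid lies close to the trigonometric terms, so the single added keyword is "tangent". The same weighted_centroid helper could represent a concept by the keywords that characterize it, which is the idea behind task (3).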
