Conference: Future Technologies Conference

Nonlinear semantic space based on lexical graph

Abstract

Distributional semantics approaches such as LSA/LSI, HAL, or Eigenwords provide low-dimensional vector space representations of words, so that similarity between words can be measured. However, these representations are derived from word statistics in natural text, where lexical and contextual similarities are mixed together: synonyms, antonyms, and context-related word pairs all end up close to each other. We present a semantic representation based purely on the lexical relations of synonymy and antonymy. Laplacian Embedding transforms a synonymy/antonymy graph into a space where lexically similar words form distinct semantic branches, with antonyms placed on opposite sides of the vector space origin. This approach also discovers highly related words that are either novel or missing from the thesaurus. The structures become more distinct after applying Independent Component Analysis (ICA); each dimension clearly represents distinct semantic content, with a word sense as its emergent property. Finally, several semantic vector space representations are compared against human-evaluated word similarity scores.
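
The abstract does not give the exact construction, so the following is only a minimal sketch of one common way to embed a synonymy/antonymy graph: a signed graph Laplacian whose lowest eigenvectors serve as word coordinates. The toy word list, the edge weights of +1 (synonym) and -1 (antonym), and the helper names `signed_laplacian_embedding` and `cosine` are all assumptions made for illustration, not the authors' method or data. The ICA step is not shown.

```python
import numpy as np

# Hypothetical toy thesaurus; these pairs are illustrative, not from the paper.
words = ["hot", "warm", "cold", "cool", "happy", "glad", "sad", "unhappy"]
synonyms = [("hot", "warm"), ("cold", "cool"), ("happy", "glad"), ("sad", "unhappy")]
antonyms = [("hot", "cold"), ("warm", "cool"), ("happy", "sad"), ("glad", "unhappy")]


def signed_laplacian_embedding(words, synonyms, antonyms, k):
    """Embed words using the k lowest eigenvectors of a signed Laplacian.

    Assumed weighting: +1 for synonym edges, -1 for antonym edges.
    """
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    W = np.zeros((n, n))
    for a, b in synonyms:
        W[index[a], index[b]] = W[index[b], index[a]] = 1.0
    for a, b in antonyms:
        W[index[a], index[b]] = W[index[b], index[a]] = -1.0

    # Signed Laplacian: degrees use absolute weights, so L = D - W is
    # positive semidefinite. Minimizing x^T L x pulls synonyms together
    # (x_i close to x_j) and pushes antonyms to mirrored positions
    # (x_i close to -x_j).
    D = np.diag(np.abs(W).sum(axis=1))
    L = D - W

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    return {w: eigvecs[index[w], :k] for w in words}


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


emb = signed_laplacian_embedding(words, synonyms, antonyms, k=2)
print("hot vs warm:", round(cosine(emb["hot"], emb["warm"]), 3))
print("hot vs cold:", round(cosine(emb["hot"], emb["cold"]), 3))
```

On this toy graph the synonym pair should come out with a cosine close to 1 and the antonym pair close to -1, which mirrors the abstract's description of antonyms sitting on opposite sides of the origin. A real thesaurus graph is far larger and not perfectly consistent, so the separation there would be less clean.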