Journal: Computer Speech and Language

Integrated concept blending with vector space models

Abstract

Traditional concept retrieval relies on conventional definition dictionaries, which perform a simple function: they map words to their definitions. This approach mostly helps readers and language students, but writers sometimes need to find a word that encompasses a set of ideas they have in mind. Inverse dictionaries address this task; however, in some cases the sought word corresponds not to a single definition but to a composite meaning of several concepts. A language producer then needs a concept search that starts from a group of words or a series of related terms and looks for a target word. This paper aims to assist with this task by presenting a new approach to concept blending: a search-by-concept method based on vector space representation that uses semantic analysis and statistical natural language processing techniques. Words are represented as numeric vectors based on different semantic similarity measures and probabilistic measures; the semantic properties of a word are captured in the vector elements, which are determined by a given linguistic context. Three different sources are used as context for word vector construction: WordNet, a distributional thesaurus, and the Latent Dirichlet Allocation algorithm; each source is used to build a different semantic vector space. The input to the concept blender is a set of n nouns. Each input noun is read and replaced by its corresponding vector. A semantic space analysis, including a filtering and ranking process, is then carried out to produce a list of target words. A test set of 50 concepts was created to evaluate the system's performance. A group of 30 evaluators judged that our integrated concept blending model provides better results for finding an adequate word for the given set of concepts.
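The pipeline described in the abstract (replace each input noun with its vector, then filter and rank candidate target words) can be illustrated with a minimal sketch. The abstract does not say how the input vectors are combined or which ranking measure is used, so the centroid averaging and cosine similarity below are assumptions made for illustration, not the authors' method. The names `TOY_SPACE`, `cosine`, and `blend` are hypothetical, and the vector values are invented; in the paper the vectors would come from WordNet, a distributional thesaurus, or Latent Dirichlet Allocation.

```python
import math

# Hypothetical toy vector space. The values are invented for illustration;
# the paper builds its vectors from WordNet, a distributional thesaurus, or LDA.
TOY_SPACE = {
    "water":  [0.9, 0.1, 0.0],
    "frozen": [0.1, 0.9, 0.0],
    "ice":    [0.7, 0.7, 0.0],
    "steam":  [0.8, 0.0, 0.6],
    "fire":   [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def blend(nouns, space, top_k=3):
    """Rank candidate target words for a set of input nouns.

    Assumption: the inputs are blended by averaging their vectors, and
    candidates are ranked by cosine similarity to that average.
    """
    # Replace each input noun with its vector.
    vectors = [space[n] for n in nouns]
    # Blend the inputs into a single centroid vector.
    centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
    # Filter: the input nouns themselves are not valid target words.
    candidates = [
        (word, cosine(centroid, vec))
        for word, vec in space.items()
        if word not in nouns
    ]
    # Rank: most similar candidates first.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_k]


if __name__ == "__main__":
    for word, score in blend(["water", "frozen"], TOY_SPACE):
        print(f"{word}\t{score:.3f}")
```

With these invented vectors, the inputs "water" and "frozen" rank "ice" first, because its vector points in the same direction as their centroid. A real system would run the same ranking once per semantic space and then integrate the three result lists, a step the abstract mentions but does not detail.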
