Journal: ACM SIGIR FORUM

Non-negative Matrix Factorization Meets Word Embedding


Abstract

Document clustering is central in modern information retrieval applications. Among existing models, non-negative matrix factorization (NMF) approaches have proven effective for this task. However, NMF approaches, like other models in this context, exhibit a major drawback, namely they use the bag-of-words representation and, thus, do not account for the sequential order in which words occur in documents. This is an important issue since it may result in a significant loss of semantics. In this paper, we aim to address the above issue and propose a new model which successfully integrates a word embedding model, word2vec, into an NMF framework so as to leverage the semantic relationships between words. Empirical results, on several real-world datasets, demonstrate the benefits of our model in terms of text document clustering as well as document/word embedding.
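
To make the baseline concrete, the following is a minimal sketch of plain NMF document clustering on a bag-of-words matrix, the setting the abstract describes as its starting point. It is not the authors' model: it does not include the word2vec component, and the toy corpus, the function name `nmf_cluster`, the parameter defaults, and the choice of Lee-Seung multiplicative updates with a Frobenius-norm objective are all assumptions made for illustration.

```python
import numpy as np


def nmf_cluster(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative docs-by-terms matrix X as W @ H.

    Hypothetical helper: uses multiplicative updates that keep W and H
    non-negative, then assigns each document to its largest factor in W.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = X.shape
    W = rng.random((n_docs, k)) + eps   # document-to-factor weights
    H = rng.random((k, n_terms)) + eps  # factor-to-term weights
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    labels = W.argmax(axis=1)           # cluster = dominant factor per document
    return W, H, labels


# Toy corpus (an assumption): two topics with disjoint vocabularies.
docs = [
    "matrix factorization clustering",
    "clustering matrix factorization model",
    "word embedding semantic",
    "semantic word embedding vectors",
]
vocab = sorted({term for doc in docs for term in doc.split()})

# Bag-of-words counts: word order inside each document is discarded here,
# which is the limitation the abstract points out.
X = np.array([[doc.split().count(term) for term in vocab] for doc in docs],
             dtype=float)

W, H, labels = nmf_cluster(X, k=2)
print(labels)
```

Because each row of `X` only counts terms, two documents containing the same words in a different order are indistinguishable to this baseline, and related words such as synonyms are treated as unrelated columns. That is the gap the abstract proposes to close by bringing word embeddings into the factorization.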
