Collapsed Gibbs sampler for sparse topic models and discrete matrix factorization

Abstract

In an inference system for organizing a corpus of objects, feature representations are generated comprising distributions over a set of features corresponding to the objects. A topic model defining a set of topics is inferred by performing latent Dirichlet allocation (LDA) with an Indian Buffet Process (IBP) compound Dirichlet prior probability distribution. The inference is performed using a collapsed Gibbs sampling algorithm by iteratively sampling (1) topic allocation variables of the LDA and (2) binary activation variables of the IBP compound Dirichlet prior. In some embodiments the inference is configured such that each inferred topic model is a clean topic model with topics defined as distributions over sub-sets of the set of features selected by the prior. In some embodiments the inference is configured such that the inferred topic model associates a focused sub-set of the set of topics to each object of the training corpus.
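
The alternation between the two kinds of variables named in the abstract can be illustrated with a small sketch. This is not the patented method. It is a simplified stand-in that assumes a fixed number of topics K, a fixed activation probability `pi` in place of a true IBP prior, and sparsity on the document side only (each document uses a subset of topics). The function name `gibbs_sparse_lda` and all parameter names and default values are hypothetical.

```python
import math
import random


def gibbs_sparse_lda(docs, V, K, alpha=0.5, beta=0.1, pi=0.5, iters=50, seed=0):
    """Collapsed Gibbs sketch: LDA with per-document binary topic activations.

    Assumptions (not from the source): finite K, fixed Bernoulli(pi) prior on
    each activation flag, symmetric Dirichlet parameters alpha and beta.
    docs: list of non-empty lists of word ids in range(V).
    Returns (z, b): topic assignments per token and activation flags per document.
    """
    rng = random.Random(seed)
    D = len(docs)
    b = [[1] * K for _ in range(D)]  # start with every topic active
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    n_dk = [[0] * K for _ in range(D)]  # tokens of doc d assigned to topic k
    n_kw = [[0] * V for _ in range(K)]  # tokens of word w assigned to topic k
    n_k = [0] * K                       # total tokens assigned to topic k

    def bump(d, k, w, delta):
        n_dk[d][k] += delta
        n_kw[k][w] += delta
        n_k[k] += delta

    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            bump(d, z[d][i], w, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            N_d = len(doc)
            # (1) Resample topic allocations, restricted to active topics.
            for i, w in enumerate(doc):
                bump(d, z[d][i], w, -1)
                weights = [
                    b[d][t] * (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                z[d][i] = rng.choices(range(K), weights=weights)[0]
                bump(d, z[d][i], w, +1)
            # (2) Resample binary activations of topics unused by this document.
            for t in range(K):
                if n_dk[d][t] > 0:
                    b[d][t] = 1  # a topic in use must stay active
                    continue
                m = sum(b[d]) - b[d][t]  # other active topics (>= 1 here)
                log_on = (math.log(pi) + math.lgamma(alpha * (m + 1))
                          - math.lgamma(alpha * (m + 1) + N_d))
                log_off = (math.log(1 - pi) + math.lgamma(alpha * m)
                           - math.lgamma(alpha * m + N_d))
                p_on = 1.0 / (1.0 + math.exp(log_off - log_on))
                b[d][t] = 1 if rng.random() < p_on else 0
    return z, b
```

A call such as `gibbs_sparse_lda([[0, 1, 0, 1], [2, 3, 2, 3]], V=4, K=3)` returns the sampled assignments and flags. Because the sampler is stochastic, results depend on the seed. In this sketch, switching a topic off for a document removes it from that document's Dirichlet, which is the sense in which each document ends up with a focused subset of topics.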
