Two time-efficient gibbs sampling inference algorithms for biterm topic model

Zhou Xiaotang; Ouyang Jihong; Li Ximing

Abstract

The Biterm Topic Model (BTM) is an effective topic model for short texts. However, its standard Gibbs sampling inference method (StdBTM) takes much more time than the standard Gibbs sampler (StdLDA) for Latent Dirichlet Allocation (LDA). To address this problem, we propose two time-efficient Gibbs sampling inference methods for BTM, SparseBTM and ESparseBTM, by trading off space against time consumption. SparseBTM reduces the computation in StdBTM by both recycling intermediate results and exploiting the sparsity of the count matrix. Theoretically, SparseBTM reduces the time complexity of StdBTM from O(|B|K) to O(|B|K_w), which scales linearly with the sparsity of the count matrix (K_w) instead of the number of topics (K), where K_w < K and K_w is the average number of non-zero topics per word type in the count matrix. Experimental results show that under favorable conditions SparseBTM is approximately 18 times faster than StdBTM. ESparseBTM builds on SparseBTM and is more time-efficient still: it reduces computation further by rearranging the biterm sequence so that more intermediate results can be recycled. In theory, ESparseBTM reduces the time complexity of SparseBTM from O(|B|K_w) to O(R|B|K_w), where 0 < R < 1 and R is the ratio of the number of biterm types to the number of biterms. Experimental results show that ESparseBTM improves on the time efficiency of SparseBTM by between 6.4% and 39.5%, depending on the dataset.
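
The two quantities that drive the complexity bounds in the abstract, R and K_w, can be illustrated on toy data. The following is a minimal sketch, not the authors' implementation. It assumes that a biterm is an unordered pair of words co-occurring in the same short text, that |B| denotes the total number of biterms, and that the count matrix has one row per word type and one column per topic. The function names and the toy corpus are hypothetical.

```python
from itertools import combinations


def extract_biterms(doc):
    """Return all unordered word pairs (biterms) from one tokenized short text."""
    # Sorting each pair makes (a, b) and (b, a) count as the same biterm type.
    return [tuple(sorted(pair)) for pair in combinations(doc, 2)]


def biterm_type_ratio(docs):
    """R = (number of distinct biterm types) / (total number of biterms)."""
    biterms = [b for doc in docs for b in extract_biterms(doc)]
    if not biterms:
        raise ValueError("corpus contains no biterms")
    return len(set(biterms)) / len(biterms)


def average_nonzero_topics(word_topic_counts):
    """K_w = mean number of topics with a non-zero count per word type.

    Assumed layout: one row per word type, one column per topic.
    """
    rows = list(word_topic_counts)
    if not rows:
        raise ValueError("count matrix has no rows")
    return sum(sum(1 for c in row if c > 0) for row in rows) / len(rows)


# Toy corpus of three short texts (hypothetical data).
docs = [["apple", "banana", "fruit"], ["apple", "fruit"], ["banana", "fruit"]]
R = biterm_type_ratio(docs)  # 3 biterm types out of 5 biterms

# Toy count matrix with 3 word types and K = 4 topics (hypothetical data).
counts = [[3, 0, 0, 1], [0, 2, 0, 0], [1, 1, 0, 4]]
K_w = average_nonzero_topics(counts)  # (2 + 1 + 3) / 3 non-zero topics per word
```

On this toy input R is 0.6 and K_w is 2.0 against K = 4. Under the bounds quoted in the abstract, a smaller K_w relative to K and a smaller R both mean less work per sampling sweep.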

Bibliographic details

Source
Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies, 2018, Issue 3, 25 pages
Authors
Zhou Xiaotang; Ouyang Jihong; Li Ximing
Affiliations

Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun, Jilin, Peoples R China (listed for each of the three authors)

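Indexing information
Original format: PDF
Language: English
Chinese Library Classification: automation technology; computer technology
Keywords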
Biterm topic model; Topic model; Latent Dirichlet allocation; Gibbs sampling;

