Foreign-language conference: Advances in Data and Web Management

Bag of Timestamps: A Simple and Efficient Bayesian Chronological Mining


Abstract

In this paper, we propose a new probabilistic model, Bag of Timestamps (BoT), for chronological text mining. BoT is an extension of latent Dirichlet allocation (LDA) and has two notable features compared with the previously proposed Topics over Time (ToT) model, which is also an extension of LDA. First, BoT avoids overfitting to temporal data, because temporal data are modeled in a Bayesian manner similar to the way word frequencies are modeled. Second, the conditional probability used in BoT contains no functions that require time-consuming computation. Experiments on newswire documents show that BoT fits temporal data more moderately than ToT and does so in a shorter execution time.
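The abstract does not state the conditional probability itself. As a rough illustration only, the sketch below assumes that timestamps are treated as discrete tokens with a symmetric Dirichlet prior, in the same way words are treated in collapsed Gibbs sampling for LDA. Under that assumption, the topic conditional for a token reduces to ratios of smoothed counts, so it needs only additions, multiplications, and divisions. The function name, the toy counts, and the hyperparameter values are hypothetical and are not taken from the paper.

```python
def topic_conditional(doc_topic, topic_item, topic_total, item,
                      prior_doc, prior_item, n_items):
    """Collapsed-Gibbs style conditional over topics for one token.

    p(k) is proportional to
        (doc_topic[k] + prior_doc)
        * (topic_item[k][item] + prior_item)
        / (topic_total[k] + n_items * prior_item).
    The same form is assumed here for word tokens and timestamp tokens;
    only the count tables differ.
    """
    weights = [
        (doc_topic[k] + prior_doc)
        * (topic_item[k][item] + prior_item)
        / (topic_total[k] + n_items * prior_item)
        for k in range(len(doc_topic))
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy counts: 2 topics, 3 distinct timestamps (e.g. three years).
doc_topic = [3, 1]             # tokens of this document assigned to each topic
topic_timestamp = [[4, 1, 0],  # topic 0: counts per timestamp
                   [0, 2, 5]]  # topic 1: counts per timestamp
topic_total = [5, 7]           # row sums of topic_timestamp

# Topic probabilities for a timestamp token with value 0 (the earliest year).
probs = topic_conditional(doc_topic, topic_timestamp, topic_total,
                          item=0, prior_doc=0.5, prior_item=0.1, n_items=3)
```

Because the prior adds a pseudo-count to every timestamp, a topic never assigns zero probability to a timestamp it has not yet seen, which is one way to read the abstract's claim about avoiding overfitting to temporal data.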
