Fast Topic Discovery From Web Search Streams

Abstract

Web search involves voluminous data streams that record millions of users' interactions with the search engine. Recently, latent topics in web search data have been found to be critical for a wide range of search engine applications, such as search personalization and search history warehousing. However, existing methods usually discover latent topics from web search data in an offline, retrospective fashion. Hence, they are increasingly ineffective in the face of ever-growing web search data that accumulates as online streams. In this paper, we propose a novel probabilistic topic model, the Web Search Stream Model (WSSM), which is carefully calibrated to handle two salient features of web search data: it arrives as streams and in massive volume. We further propose an efficient parameter inference method, Stream Parameter Inference (SPI), to train WSSM on massive web search streams. Based on a large-scale search engine query log, we conduct extensive experiments to verify the effectiveness and efficiency of WSSM and SPI. We observe that WSSM together with SPI discovers latent topics from web search streams faster than state-of-the-art methods while retaining comparable topic modeling accuracy.

Bibliographic record

Source: International conference on world wide web, 2014, pp. 949-959 (11 pages)
Authors: Di Jiang; Kenneth Wai-Ting Leung; Wilfred Ng
Original format: PDF
Keywords: Web Search; Query Log; Probabilistic Topic Model

Similar literature

1. SG-WSTD: A framework for scalable geographic web search topic discovery [J]. Jiang Di, Vosecky Jan, Leung Kenneth Wai-Ting. Knowledge-Based Systems, 2015, issue auga.
2. Topic Based Query Suggestion Using Hidden Topic Model for Effective Web Search [J]. M. Barathi, S. Valli. Journal of Theoretical and Applied Information Technology, 2014, issue 3.
3. Query types and search topics of German Web search engine users [J]. Dirk Lewandowski. Information Services & Use, 2006, issue 4.
4. Fast Topic Discovery From Web Search Streams [C]. Di Jiang, Kenneth Wai-Ting Leung, Wilfred Ng. International conference on world wide web, 2014.
5. Profiling topics on the Web for knowledge discovery [D]. Sehgal, Aditya Kumar. 2007.
6. Characterizing Interdisciplinarity of Researchers and Research Topics Using Web Search Engines [O]. Hiroki Sayama, Jin Akaishi. 2009.
7. Query types and search topics of German Web search engine users [O]. Lewandowski Dirk. 2006.
