A Collaborative Decentralized Approach to Web Search

Abstract

Most explanations of user behavior on the Web are based on a top-down approach, in which the entire Web, viewed as a vast collection of pages and interconnecting links, is used to predict how users interact with it. A prominent example of this approach is the random-surfer model, the core ingredient behind Google's PageRank. This model exploits the link structure of the Web to estimate the percentage of web surfers viewing any given page. In contrast, a bottom-up approach starts from the user and incrementally builds the dynamics of the Web as the result of users' interaction with it. This second approach has not been widely investigated, although it offers numerous advantages over the top-down approach regarding (at least) personalization and decentralization of the infrastructure required for web tools. In this paper, we propose a bottom-up approach to studying web dynamics based on web-related data browsed, collected, tagged, and semi-organized by end users. Our approach has been materialized into a hybrid bottom-up search engine that produces search results based solely on user-provided web-related data and their sharing among users. We conduct an extensive experimental study to demonstrate the qualitative and quantitative characteristics of user-generated web-related data, their strengths and weaknesses, and to compare the search results of our bottom-up search engine with those of a traditional one. Our study shows that a bottom-up search engine starts from a core consisting of the most interesting part of the Web (according to user opinions) and incrementally (and measurably) improves its ranking, coverage, and accuracy. Finally, we discuss how our approach can be integrated with PageRank, resulting in a new page ranking algorithm that uniquely combines link analysis with users' preferences.
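To make the random-surfer model mentioned above concrete, the following is a minimal sketch of PageRank computed by power iteration, with an optional preference vector that biases the surfer's random jumps. This is the standard textbook formulation (often called personalized PageRank), not the ranking algorithm proposed in the paper, whose details are not given in the abstract. The function name `pagerank`, the toy graph `web`, the damping factor of 0.85, and the preference weights are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional


def pagerank(
    links: Dict[str, List[str]],
    damping: float = 0.85,
    preference: Optional[Dict[str, float]] = None,
    iterations: int = 100,
) -> Dict[str, float]:
    """Estimate the share of random surfers on each page (illustrative sketch).

    links maps a page to the pages it links to. preference is a hypothetical
    stand-in for user-provided interest weights; if omitted, the surfer jumps
    to every page with equal probability.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Teleport distribution: where the surfer lands on a random jump.
    if preference is None:
        teleport = {p: 1.0 / n for p in pages}
    else:
        total = sum(preference.get(p, 0.0) for p in pages)
        if total <= 0:
            raise ValueError("preference must give positive weight to some page")
        teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps at random.
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Otherwise the surfer follows one outgoing link uniformly.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Pages without outgoing links hand their mass to the jump.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


# Toy three-page web (hypothetical data).
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform = pagerank(web)
personalized = pagerank(web, preference={"b": 1.0})
```

In this toy graph, placing all of the preference weight on page `b` raises its score relative to the uniform case, which is one simple way link analysis and user preferences can be combined in a single ranking.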
