Conference on Multimedia Computing and Networking; January 30-31, 2008; San Jose, CA (US)

Understanding the Practical Limits of the Gnutella P2P System: An Analysis of Query Terms and Object Name Distributions

Abstract

A number of prior efforts analyzed the behavior of popular peer-to-peer (P2P) systems and proposed ways of maintaining the overlays as well as methods for searching for content using these overlays. However, little was known about how successful users could be in locating the shared objects in these systems. There might be a mismatch between the way content creators named objects and the way consumers queried for them. Our aim was to examine the terms used in the queries and shared object names in the Gnutella file-sharing system. We analyzed the names of over 20 million objects collected from 40,000 peers, as well as terms from over 230,000 queries. We observed that almost half (44.4%) of the queries had no matching objects in the system, regardless of the overlay or search mechanism used to locate the objects. We also evaluated the query success rates against random peer groups of various sizes (200, 1K, 2K, 3K, 4K, 5K, 10K, and 20K peers sampled from the full 40,000 peers). The success rates increased rapidly from 200 to 5,000 peers but improved only modestly beyond 5,000 peers. Finally, we observed Zipf-like distributions for the query terms and the object names. However, the relative popularity of a term in the object names did not correlate with the term's popularity in the query workload. This observation affected the ability of hybrid P2P systems to guide searches by creating a synopsis of the peer object names: a synopsis built from the distribution of terms in the object names need not contain the terms that are relevant to the queries. Our results can be used to guide the design of future P2P systems that are optimized for the observed object names and user query behavior.
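The success-rate measurement described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`tokenize`, `success_rate`, `sampled_success_rate`), the toy peers and queries, and the matching rule (a query succeeds if all of its terms appear in the name of at least one shared object) are assumptions made for illustration only.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def success_rate(queries, peer_objects):
    """Fraction of queries matched by at least one object name.

    Assumed matching rule (hypothetical): a query matches an object
    when every query term occurs among the terms of the object name.
    """
    name_terms = [tokenize(name) for names in peer_objects for name in names]
    hits = 0
    for query in queries:
        q_terms = tokenize(query)
        if q_terms and any(q_terms <= terms for terms in name_terms):
            hits += 1
    return hits / len(queries) if queries else 0.0


def sampled_success_rate(queries, peers, group_size, seed=0):
    """Success rate against a random group of `group_size` peers."""
    rng = random.Random(seed)
    group = rng.sample(list(peers.values()), group_size)
    return success_rate(queries, group)


# Toy data, invented for this example (not from the study).
peers = {
    "peer_a": ["Holiday Song.mp3", "lecture_notes.pdf"],
    "peer_b": ["holiday_photos.zip"],
    "peer_c": ["Linux Kernel Guide.pdf"],
}
queries = ["holiday song", "linux guide", "rare album", "photos"]

full_rate = success_rate(queries, peers.values())
print(f"all peers: {full_rate:.2f}")
for size in (1, 2, 3):
    print(f"{size} peer(s): {sampled_success_rate(queries, peers, size):.2f}")
```

With real traces, the same loop would be run over group sizes such as 200 to 20,000 peers, ideally averaging several random samples per size to smooth out sampling noise.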
