Foreign-language conference: IEEE International Conference on Computer and Communications

Memory-Compact Membership Lookup for Multiple Data Sets by a Single Bloom Filter



Abstract

A Bloom filter is a memory-compact data structure that encodes a set of data items and answers set membership queries with no false negatives and a configurable false positive rate. It is a fundamental tool with a wide range of applications across disciplines such as data science, networking, computer architecture, and distributed computing. However, a Bloom filter faces a memory allocation challenge: how much memory should its data structure be given when the encoded data set is formed dynamically and its size is not known in advance? As more set elements arrive, the data structure becomes more crowded, and the false positive rate of its membership queries increases. The problem becomes even more challenging when there are multiple data sets to represent and each data set is formed independently in a streaming fashion. The traditional way to support set membership checking for multiple data sets is to allocate a separate Bloom filter to each data set. This paper instead takes a dramatically different approach: we encode all data sets in a single large filter and still support membership lookup for all of them, with a false positive rate bound that is independently configurable for each set. We analyze the properties of the filter and, in particular, the formulas for its feasible region, where the false positive rate requirements are met for all data sets.
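The abstract does not give the construction itself, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes one way of sharing a bit array among several sets: each item is hashed together with a set identifier, and each set uses its own number of hash functions. The names `SharedBloomFilter`, `set_id`, `num_hashes`, `estimated_fpr`, and `classic_fpr` are hypothetical. The sketch also includes the standard single-set approximation p ≈ (1 − e^(−kn/m))^k, which shows why the false positive rate grows as more elements arrive.

```python
import hashlib
import math


class SharedBloomFilter:
    """One bit array shared by several sets (illustrative assumption only)."""

    def __init__(self, num_bits):
        self.num_bits = num_bits
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, set_id, item, num_hashes):
        # Derive num_hashes bit positions from (set_id, hash index, item).
        for i in range(num_hashes):
            digest = hashlib.sha256(f"{set_id}|{i}|{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, set_id, item, num_hashes):
        # Set every bit selected for this (set, item) pair.
        for pos in self._positions(set_id, item, num_hashes):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, set_id, item, num_hashes):
        # True only if all selected bits are set; never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(set_id, item, num_hashes)
        )

    def fill_ratio(self):
        # Fraction of bits currently set to 1.
        ones = sum(bin(byte).count("1") for byte in self.bits)
        return ones / self.num_bits


def estimated_fpr(fill_ratio, num_hashes):
    # Chance that num_hashes independently chosen bits are all 1.
    return fill_ratio ** num_hashes


def classic_fpr(num_bits, num_items, num_hashes):
    # Standard approximation for a single-set Bloom filter.
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes
```

Under these assumptions, a set queried with more hash functions sees a lower estimated false positive rate at the same fill ratio, which is one simple way a per-set bound could be tuned; the feasible-region analysis mentioned in the abstract is not reproduced here.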
