Memory-Compact Membership Lookup for Multiple Data Sets by a Single Bloom Filter

Abstract

A Bloom filter is a memory-compact data structure that encodes a set of data items and answers set membership queries with no false negatives and a configurable false positive rate. It is a fundamental tool with a wide range of applications in multiple disciplines, such as data science, networking, computer architecture, and distributed computing. However, a Bloom filter faces a memory allocation challenge: how much memory should be given to its data structure when the encoded data set is formed dynamically and its size is not known in advance? As more set elements continuously arrive, the data structure becomes more crowded, and the false positive rate of its membership queries increases. The problem becomes even more challenging when there are multiple data sets to represent and each data set is formed independently in a streaming fashion. The traditional way to support set membership checking for multiple data sets is to allocate a separate Bloom filter to each data set. Instead, this paper takes a dramatically different approach: we encode all data sets in a single large filter and yet support membership lookup for all of them, with a false positive rate bound that is independently configurable for each set. We analyze the properties of the filter and, in particular, the formulas for its feasible region, where the false positive rate requirements are met for all data sets.
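
The crowding effect described in the abstract can be illustrated with a textbook Bloom filter. The sketch below is not the paper's single-filter scheme for multiple data sets, whose construction the abstract does not spell out. It is a minimal illustration of a conventional filter, assuming a fixed size of m bits, k bit positions derived by double hashing over SHA-256, and the standard approximation (1 - e^{-kn/m})^k for the false positive rate after n insertions. The names `BloomFilter`, `expected_fpr`, and `fill_and_measure`, as well as the parameter values, are hypothetical choices made for this example.

```python
import hashlib
import math


class BloomFilter:
    """Textbook Bloom filter with m bits and k hash positions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd stride
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item: str) -> bool:
        # True if every one of the k bits is set; may be a false positive.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def expected_fpr(m: int, k: int, n: int) -> float:
    """Standard approximation of the false positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


def fill_and_measure(m=8192, k=4, checkpoints=(500, 1000, 2000, 4000), probes=5000):
    """Insert elements in a stream and record measured vs. predicted rates."""
    bf = BloomFilter(m, k)
    inserted = 0
    results = []
    for n in checkpoints:
        while inserted < n:
            bf.add(f"member-{inserted}")
            inserted += 1
        # Probe keys were never inserted, so every hit is a false positive.
        false_hits = sum(f"probe-{j}" in bf for j in range(probes))
        results.append((n, false_hits / probes, expected_fpr(m, k, n)))
    return results


if __name__ == "__main__":
    for n, measured, predicted in fill_and_measure():
        print(f"n={n:5d}  measured={measured:.4f}  predicted={predicted:.4f}")
```

Because the filter size is fixed up front, the measured rate climbs as the stream grows, which is the allocation problem the abstract raises for sets whose size is not known in advance.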

Bibliographic details

Source: IEEE International Conference on Computer and Communications, 2018, pp. 1798-1802 (5 pages)
Authors: Jiyang Chen; Qingjun Xiao
Original format: PDF
Keywords: Arrays; Encoding; Memory management; Hash functions; Silicon; Classification algorithms

Similar literature

1. Fast Dynamic Multiple-Set Membership Testing Using Combinatorial Bloom Filters [J]. Hao F., Kodialam M., Lakshman T. V. IEEE/ACM Transactions on Networking, 2012, No. 1.
2. Noisy Bloom Filters for Multi-Set Membership Testing [J]. Haipeng Dai, Yuankun Zhong, Alex X. Liu. Performance Evaluation Review, 2016, No. 1.
3. Sum-Up Counting Bloom Filter-Based Name Lookup Method for Named Data Networking [J]. Tingting Wu, Lang Zhang, Jianyun Lei. Recent Advances in Electrical & Electronic Engineering, 2018, No. 2.
4. Memory-Compact Membership Lookup for Multiple Data Sets by a Single Bloom Filter [C]. Jiyang Chen, Qingjun Xiao. IEEE International Conference on Computer and Communications, 2018.
5. Extracting patterns from large movement data sets using Hybrid Spatio-temporal Filtering: A case study of geovisual analytics in support of fisheries enforcement activities [D]. Enguehard, Rene A. 2011.
6. Pattern Matching for DNA Sequencing Data Using Multiple Bloom Filters [O]. Maleeha Najam, Raihan Ur Rasool, Hafiz Farooq Ahmad. 2006.
7. Steerable Name Lookup based on Classified Prefixes and Scalable One Memory Access Bloom Filter for Named Data Networking [O]. Sheng Huang, Jianghua Xu, Xiaofei Yang. 2016.
