Memory-Compact Membership Lookup for Multiple Data Sets by a Single Bloom Filter

Abstract

A Bloom filter is a memory-compact data structure that encodes a set of data items and answers set membership queries with no false negatives and a configurable false positive rate. It is a fundamental tool with a wide range of applications across disciplines such as data science, networking, computer architecture, and distributed computing. However, a Bloom filter faces a memory allocation challenge: how much memory should be given to its data structure when the encoded data set is formed dynamically and its size is not known in advance? As more set elements arrive, the data structure becomes more crowded, and the false positive rate of membership queries increases. The problem becomes even more challenging when there are multiple data sets to represent and each is formed independently in a streaming fashion. The traditional way to support membership checking for multiple data sets is to allocate a separate Bloom filter to each data set. Instead, this paper takes a dramatically different approach: we encode all data sets in a single large filter and yet support membership lookup for all of them, with a false positive rate bound that is independently configurable for each set. We analyze the properties of the filter and, in particular, the formulas for its feasible region, where the false positive rate requirements are met for all data sets.
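
The crowding effect described in the abstract can be illustrated with a minimal sketch of a standard single-set Bloom filter. This is not the paper's multi-set design, which the abstract does not specify. The class name `BloomFilter`, the helper `expected_fpr`, the SHA-256 based double hashing, and the parameters `m` (bits), `k` (hash functions), and `n` (inserted items) are all assumptions made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative standard Bloom filter with m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # m bits, all initially zero

    def _positions(self, item: str) -> list[int]:
        # Assumption: derive k bit positions from two 64-bit hash values
        # (double hashing) instead of k independent hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the stride odd
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        # Set the k bits selected by the item's hashes.
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item: str) -> bool:
        # An inserted item always finds its k bits set (no false negatives);
        # a non-member may find them set by other items (false positive).
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def expected_fpr(m: int, k: int, n: int) -> float:
    """Standard approximation of the false positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

With `m` and `k` fixed, `expected_fpr(m, k, n)` grows as `n` grows, which is the sizing problem the abstract raises for sets whose final size is not known in advance.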

Bibliographic details

Source
IEEE International Conference on Computer and Communications | 2018 | 1 v. | 5 pages
Authors
Jiyang Chen; Qingjun Xiao
Original format: PDF
CLC classification: Computing technology, computer technology
Keywords
Arrays; Encoding; Memory management; Hash functions; Silicon; Classification algorithms

Similar literature

1. Fast Dynamic Multiple-Set Membership Testing Using Combinatorial Bloom Filters [J]. Hao F., Kodialam M., Lakshman T. V. IEEE/ACM Transactions on Networking. 2012, No. 1
2. Noisy Bloom Filters for Multi-Set Membership Testing [J]. Haipeng Dai, Yuankun Zhong, Alex X. Liu. Performance Evaluation Review. 2016, No. 1
3. Sum-Up Counting Bloom Filter-Based Name Lookup Method for Named Data Networking [J]. Tingting Wu, Lang Zhang, Jianyun Lei. Recent Advances in Electrical & Electronic Engineering. 2018, No. 2
4. Memory-Compact Membership Lookup for Multiple Data Sets by a Single Bloom Filter [C]. Jiyang Chen, Qingjun Xiao. IEEE International Conference on Computer and Communications. 2018
5. Extracting patterns from large movement data sets using Hybrid Spatio-temporal Filtering: A case study of geovisual analytics in support of fisheries enforcement activities [D]. Enguehard, Rene A. 2011
6. Pattern Matching for DNA Sequencing Data Using Multiple Bloom Filters [O]. Maleeha Najam, Raihan Ur Rasool, Hafiz Farooq Ahmad. 2006
7. Steerable Name Lookup based on Classified Prefixes and Scalable One Memory Access Bloom Filter for Named Data Networking [O]. Sheng Huang, Jianghua Xu, Xiaofei Yang. 2016
