Journal: IEEE Transactions on Smart Grid

Efficient Histogram Estimation for Smart Grid Data Processing With the Loglog-Bloom-Filter

Abstract

With the emergence of smart grids, a critical challenge for administrators of wide-area measurement systems is analyzing and modeling streaming data with the limited resources of their embedded controllers. Streaming data can usually be modeled as a multiset in which each data item has its own frequency. In this paper, we study the problem of generating histograms of data items based on their frequency, so that issues such as power line tripping or line faults can be identified under these resource constraints. The primary challenge with conventional methods is that keeping an individual counter for each unique type of data is too memory-consuming, slow, and costly. We describe a novel data structure and its associated algorithms, called the loglog Bloom filter, for this purpose. This data structure extends the classical Bloom filter with a recent technique called probabilistic counting, so it can effectively generate histograms for streaming data in one pass with sub-linear overhead. The method is therefore suitable for data processing in smart grids, where limited computational resources are available on the controllers. We analyze the performance, trade-offs, and capacity of this data structure, and evaluate it with real data traces collected through the frequency disturbance recorders deployed for the FNET/GridEye infrastructure. We demonstrate that this method can identify the frequencies of all unique items with high accuracy and low memory overhead, so that data outliers can be conveniently identified.
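The abstract does not spell out the construction, so the following is only a rough sketch of the general idea it names: a Bloom-filter-style cell array in which each cell holds a probabilistic (Morris-style) counter that stores an exponent instead of an exact count. The class name `LogLogCounterFilter`, the helper `frequency_histogram`, the use of salted SHA-256 hashes, and all parameter values are assumptions made for illustration; this is not the paper's algorithm.

```python
import hashlib
import random
from collections import Counter


class LogLogCounterFilter:
    """Hypothetical sketch: Bloom-style cell array of Morris approximate counters."""

    def __init__(self, num_cells=1024, num_hashes=3, seed=0):
        self.num_cells = num_cells
        self.num_hashes = num_hashes
        self.cells = [0] * num_cells    # each cell stores an exponent, not a count
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def _positions(self, item):
        # Derive cell indices from salted SHA-256 digests (an assumed hash choice).
        # A set avoids bumping the same cell twice for a single item.
        positions = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.num_cells)
        return positions

    def add(self, item):
        for pos in self._positions(item):
            exponent = self.cells[pos]
            # Morris rule: raise the exponent with probability 2**-exponent.
            if self.rng.random() < 2.0 ** -exponent:
                self.cells[pos] = exponent + 1

    def estimate(self, item):
        # Hash collisions add extra increments to a cell, so use the smallest cell.
        smallest = min(self.cells[pos] for pos in self._positions(item))
        return 2 ** smallest - 1


def frequency_histogram(filt, candidate_items):
    """Map each estimated frequency to the number of candidate items having it.

    The candidate keys are supplied by the caller; this sketch does not
    recover them from the filter itself.
    """
    return Counter(filt.estimate(item) for item in set(candidate_items))
```

Each cell only needs enough bits to hold an exponent, which is where the memory saving over one exact counter per unique item would come from. The estimates are coarse (they take values of the form 2^c - 1), so a practical design would need the accuracy analysis that the paper reports.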