Journal: Statistics and Computing

Multiple changepoint detection in categorical data streams


Abstract

The need for efficient tools is pressing in the era of big data, particularly in streaming data applications. As data streams are ubiquitous, the ability to accurately detect multiple changepoints, without affecting the continuous flow of data, is an important issue. Change detection for categorical data streams is understudied, and existing work commonly introduces fixed control parameters while providing little insight into how they may be chosen. This is ill-suited to the streaming paradigm, motivating the need for an approach that introduces few parameters which may be set without requiring any prior knowledge of the stream. This paper introduces such a method, which can accurately detect changepoints in categorical data streams with fixed storage and computational requirements. The detector relies on the ability to adaptively monitor the category probabilities of a multinomial distribution, where temporal adaptivity is introduced using forgetting factors. A novel adaptive threshold is also developed which can be computed given a desired false positive rate. This method is then compared to sequential and nonsequential change detectors in a large simulation study which verifies the usefulness of our approach. A real data set consisting of nearly 40 million events from a computer network is also investigated.
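To make the idea of monitoring category probabilities with a forgetting factor concrete, the following is a minimal sketch only, not the paper's detector. It assumes a fixed forgetting factor and an exponentially discounted count per category; the class name `ForgettingFactorMultinomial` and its parameters are hypothetical, and the adaptive forgetting and adaptive threshold described in the abstract are not implemented here.

```python
from typing import List


class ForgettingFactorMultinomial:
    """Exponentially discounted estimate of multinomial category probabilities.

    Hypothetical illustration: a FIXED forgetting factor is assumed, whereas
    the abstract describes an adaptive scheme that is not reproduced here.
    """

    def __init__(self, n_categories: int, forgetting_factor: float = 0.95) -> None:
        if n_categories < 1:
            raise ValueError("n_categories must be at least 1")
        if not 0.0 < forgetting_factor <= 1.0:
            raise ValueError("forgetting_factor must lie in (0, 1]")
        self.lam = forgetting_factor
        self.counts = [0.0] * n_categories  # discounted count per category
        self.weight = 0.0                   # discounted number of observations

    def update(self, category: int) -> List[float]:
        """Down-weight past data, add the new observation, return the estimate."""
        self.counts = [self.lam * c for c in self.counts]
        self.counts[category] += 1.0
        self.weight = self.lam * self.weight + 1.0
        return self.probabilities()

    def probabilities(self) -> List[float]:
        """Current estimate; uniform before any data has been seen."""
        k = len(self.counts)
        if self.weight == 0.0:
            return [1.0 / k] * k
        return [c / self.weight for c in self.counts]


if __name__ == "__main__":
    # Toy stream: category 0 dominates at first, then category 2 takes over.
    stream = [0] * 50 + [2] * 50
    estimator = ForgettingFactorMultinomial(n_categories=3, forgetting_factor=0.9)
    for t, x in enumerate(stream, start=1):
        p = estimator.update(x)
        if t in (50, 100):
            print(t, [round(v, 3) for v in p])
```

Storage is one number per category plus one weight, and each update costs time proportional to the number of categories, which matches the fixed storage and computational requirements mentioned above. With a forgetting factor of 1 the estimate reduces to the ordinary relative frequencies; smaller values make the estimate react faster to a change at the price of higher variance.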
