Journal: Knowledge and Information Systems

Tackling heterogeneous concept drift with the Self-Adjusting Memory (SAM)


Abstract

Data mining in non-stationary data streams is particularly relevant in the context of the Internet of Things and Big Data. Its challenges arise from fundamentally different drift types that violate assumptions of data independence or stationarity. Available methods often struggle with certain forms of drift or require a priori task knowledge that is unavailable. We propose the Self-Adjusting Memory (SAM) model for the k-nearest-neighbor (kNN) algorithm. SAM-kNN can deal with heterogeneous concept drift, i.e., different drift types and rates, using biologically inspired memory models and their coordination. Its basic idea is to maintain dedicated models for current and former concepts and to use them according to the demands of the given situation. It can be easily applied in practice without meta-parameter optimization. We conduct an extensive evaluation on various benchmarks, consisting of artificial streams with known drift characteristics and real-world datasets. We explicitly add new benchmarks enabling a precise performance analysis on multiple types of drift. Highly competitive results throughout all experiments underline the robustness of SAM-kNN as well as its capability to handle heterogeneous concept drift. Knowledge about drift characteristics in streaming data is not only crucial for a precise algorithm evaluation, but it also facilitates the choice of an appropriate algorithm for real-world applications. Therefore, we additionally propose two tests that determine the type and strength of drift. We extract the drift characteristics of all utilized datasets and use them for our analysis of SAM in relation to other methods.
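
The idea of keeping dedicated models for current and former concepts can be illustrated with a small sketch. This is not the published SAM-kNN algorithm: the abstract does not specify how the memories are sized, maintained, or selected, so everything below is an assumption made for illustration. That covers the class name `TwoMemoryKNN`, the fixed memory sizes, the rule that examples leaving the short-term memory move to the long-term memory, and the choice of memory by recent accuracy.

```python
from collections import Counter, deque
import math


def knn_predict(memory, x, k):
    """Majority vote among the k stored (point, label) pairs closest to x."""
    if not memory:
        return None
    nearest = sorted(memory, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


class TwoMemoryKNN:
    """Toy illustration only; hypothetical, not the authors' method."""

    def __init__(self, k=3, stm_size=50, ltm_size=200, history=20):
        self.k = k
        self.stm = deque(maxlen=stm_size)  # recent examples (current concept)
        self.ltm = deque(maxlen=ltm_size)  # older examples (former concepts)
        # Recent hit/miss record per memory, used to decide which one to trust.
        self.hits = {"stm": deque(maxlen=history), "ltm": deque(maxlen=history)}

    def _score(self, name):
        record = self.hits[name]
        return sum(record) / len(record) if record else 0.0

    def predict(self, x):
        # Use the memory that was more accurate recently; ties favor the STM.
        name = "ltm" if self._score("ltm") > self._score("stm") else "stm"
        memory = self.stm if name == "stm" else self.ltm
        guess = knn_predict(memory, x, self.k)
        if guess is None:  # chosen memory is empty, fall back to the other one
            other = self.ltm if name == "stm" else self.stm
            guess = knn_predict(other, x, self.k)
        return guess

    def learn(self, x, y):
        # Test-then-train: score both memories on the new example first.
        for name, memory in (("stm", self.stm), ("ltm", self.ltm)):
            if memory:
                self.hits[name].append(knn_predict(memory, x, self.k) == y)
        # The example about to fall out of the STM moves to the LTM.
        if len(self.stm) == self.stm.maxlen:
            self.ltm.append(self.stm[0])
        self.stm.append((x, y))
```

In a stream setting one would call `predict(x)` before `learn(x, y)` for each arriving example, so that every prediction is made on data the model has not yet seen. A memory that stops matching the current concept loses accuracy in its hit record and is consulted less often, which is the coordination idea in its simplest form.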
