Journal: IEEE Transactions on Parallel and Distributed Systems

Toward Automated Anomaly Identification in Large-Scale Systems


Abstract

When a system fails to function properly, health-related data are collected for troubleshooting. However, it is challenging to effectively identify anomalies from the voluminous amount of noisy, high-dimensional data. The traditional manual approach is time-consuming, error-prone, and even worse, not scalable. In this paper, we present an automated mechanism for node-level anomaly identification in large-scale systems. A set of techniques is presented to automatically analyze collected data: data transformation to construct a uniform data format for data analysis, feature extraction to reduce data size, and unsupervised learning to detect the nodes acting differently from others. Moreover, we compare two techniques, principal component analysis (PCA) and independent component analysis (ICA), for feature extraction. We evaluate our prototype implementation by injecting a variety of faults into a production system at NCSA. The results show that our mechanism, in particular, the one using ICA-based feature extraction, can effectively identify faulty nodes with high accuracy and low computation overhead.
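The three stages named in the abstract (data transformation, feature extraction, and unsupervised detection of nodes that behave differently from the rest) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each node is summarized by a small numeric vector, and it substitutes simple techniques of its own: column standardization for the transformation step, the first principal component found by power iteration for feature extraction, and a median/MAD rule for the detection step. ICA is not shown. The function names, the threshold of 3.5, and the synthetic metrics are all hypothetical.

```python
import math
import random
import statistics


def standardize(rows):
    """Data transformation: scale every metric (column) to zero mean, unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Feature extraction: leading eigenvector of the covariance matrix (power iteration)."""
    n, d = len(rows), len(rows[0])
    # Rows are already centered, so the covariance is the averaged outer product.
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return vec


def find_anomalous_nodes(rows, threshold=3.5):
    """Detection: flag nodes whose projected score lies far from the median score."""
    scaled = standardize(rows)
    pc = first_principal_component(scaled)
    scores = [sum(v * w for v, w in zip(row, pc)) for row in scaled]
    med = statistics.median(scores)
    # Median absolute deviation, a robust spread estimate (guarded against zero).
    mad = statistics.median(abs(s - med) for s in scores) or 1e-9
    return [i for i, s in enumerate(scores)
            if abs(s - med) / (1.4826 * mad) > threshold]


# Synthetic example: 20 nodes with three made-up metrics; node 7 is the injected fault.
rng = random.Random(0)
nodes = [[50 + rng.uniform(-2, 2), 30 + rng.uniform(-2, 2), 10 + rng.uniform(-2, 2)]
         for _ in range(20)]
nodes[7] = [95.0, 90.0, 60.0]
flagged = find_anomalous_nodes(nodes)
print(flagged)
```

Projecting onto a single component keeps the sketch short. A fuller version would retain several components and, to mirror the comparison in the abstract, swap the PCA step for an ICA routine.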
