Conference: Database Systems for Advanced Applications

Example-Based Robust DB-Outlier Detection for High Dimensional Data

Abstract

This paper presents an outlier detection method that identifies exceptional objects matching user intentions in high dimensional datasets. Outlier detection is a crucial element of many applications, such as financial analysis and fraud detection. Although the problem has been studied extensively, current methods fail to discover outliers directly in high dimensional datasets because of the curse of dimensionality. In addition, many algorithms require several decisive parameters to be set in advance, and these parameters are difficult to determine without prior knowledge of the dataset. To address these problems, we take an example-based approach and examine the behavior of projections of the outlier examples in a dataset. An example-based approach is promising because users can often provide a few outlier examples that suggest what they want to detect. Importantly, the method should remain robust even if the user-provided examples include noise or inconsistencies. Our proposed method is based on the notion of DB- (Distance-Based) Outliers. Experiments demonstrate that the proposed method is effective and efficient on both synthetic and real datasets and can tolerate noisy examples.
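For readers unfamiliar with the DB-outlier notion the abstract builds on, the following is a minimal sketch of the commonly used DB(p, D) definition: an object is an outlier if at least a fraction p of the objects in the dataset lie farther than distance D from it. This is not the paper's example-based method. The function names, the Euclidean metric, and the sample values are assumptions made here for illustration only.

```python
import math


def is_db_outlier(points, index, p, d):
    """Check whether points[index] is a DB(p, d)-outlier.

    Assumption for illustration: Euclidean distance, and the fraction is
    taken over the whole dataset (the point itself counts as "not far").
    """
    target = points[index]
    # Count objects lying strictly farther than d from the target.
    far = sum(1 for q in points if math.dist(target, q) > d)
    return far / len(points) >= p


def db_outliers(points, p, d):
    """Return the indices of all DB(p, d)-outliers by brute force, O(n^2)."""
    return [i for i in range(len(points)) if is_db_outlier(points, i, p, d)]


if __name__ == "__main__":
    # Hypothetical toy data: a tight cluster plus one distant point.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
    print(db_outliers(data, p=0.75, d=3.0))
```

The sketch also shows why the abstract calls parameter selection difficult: both p and d must be chosen in advance, and in high dimensional spaces pairwise distances tend to become similar, so a single threshold d separates outliers poorly.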
