Journal: Progress in Artificial Intelligence

Surrounding neighborhood-based SMOTE for learning from imbalanced data sets

Abstract

Many traditional approaches to pattern classification assume that the problem classes share similar prior probabilities. However, in many real-life applications, this assumption is grossly violated: the ratios of prior probabilities between classes are often extremely skewed. This situation is known as the class imbalance problem. One strategy for tackling it is to balance the classes by resampling the original data set. The SMOTE algorithm is probably the most popular technique for increasing the size of the minority class by generating synthetic instances. Building on the idea of the original SMOTE, we propose three surrounding neighborhood approaches for generating artificial minority instances that take into account both the proximity and the spatial distribution of the examples. Experiments on a large collection of databases with three different classifiers demonstrate that the new surrounding neighborhood-based SMOTE procedures significantly outperform other existing over-sampling algorithms.
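
For readers unfamiliar with the baseline, the following is a minimal sketch of the interpolation step of the original SMOTE, not of the surrounding neighborhood variants proposed in the paper. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration; neighbors are chosen here by plain Euclidean k-nearest-neighbor search among minority samples.

```python
import numpy as np


def smote_sketch(X_min, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create synthetic minority samples by
    interpolating between a minority sample and one of its k nearest
    minority neighbors (the basic SMOTE idea)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = X_min.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances among minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # For each synthetic point: pick a base sample, pick one of its
    # neighbors, and place the new point at a random position on the
    # segment joining them.
    base = rng.integers(0, n, size=n_synthetic)
    pick = rng.integers(0, k, size=n_synthetic)
    chosen = neighbors[base, pick]
    gap = rng.random((n_synthetic, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy minority class: the four corners of the unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sketch(X_min, n_synthetic=10, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, the output stays inside the region those samples span. According to the abstract, the proposed methods differ in how the neighborhood is defined, considering spatial distribution as well as proximity; that selection step is not reproduced in this sketch.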