Journal: Knowledge and Information Systems

Framework for extreme imbalance classification: SWIM-sampling with the majority class



Abstract

The class imbalance problem is a pervasive issue in many real-world domains. Oversampling methods that inflate the rare class by generating synthetic data are among the most popular techniques for resolving class imbalance. However, they concentrate on the characteristics of the minority class and use them to guide the oversampling process. By completely overlooking the majority class, they lose a global view of the classification problem and, while alleviating the class imbalance, may negatively impact learnability by generating borderline or overlapping instances. This becomes even more critical under extreme class imbalance, where the minority class is strongly underrepresented and on its own does not contain enough information to guide the oversampling process. We propose a framework for synthetic oversampling that, unlike existing resampling methods, is robust in cases of extreme imbalance. The key feature of the framework is that it uses the density of the well-sampled majority class to guide the generation process. We demonstrate implementations of the framework using the Mahalanobis distance and a radial basis function. We evaluate over 25 benchmark datasets and show that the framework offers a distinct performance improvement over the existing state-of-the-art in oversampling techniques.
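The idea of letting the majority class guide generation can be illustrated with a minimal sketch. This is an assumption-laden reading of the abstract, not the authors' published algorithm: it places each synthetic point at the same Mahalanobis distance from the majority-class mean as a randomly chosen minority seed, so new points stay on the seed's density contour of the majority class. The function name `swim_md_sketch`, the `spread` parameter, and the covariance regularisation constant are all hypothetical choices made for illustration.

```python
import numpy as np


def swim_md_sketch(majority, minority, n_new, spread=0.25, seed=0):
    """Illustrative majority-guided oversampling (hypothetical helper).

    Each synthetic point has the same Mahalanobis distance to the
    majority-class mean as the minority seed it was derived from.
    """
    rng = np.random.default_rng(seed)
    n_features = majority.shape[1]

    # Majority statistics; the small ridge term is an assumption that
    # keeps the covariance positive definite for the Cholesky factor.
    mu = majority.mean(axis=0)
    cov = np.atleast_2d(np.cov(majority, rowvar=False))
    cov = cov + 1e-6 * np.eye(n_features)

    # With cov = L @ L.T, z = L^{-1} (x - mu) is a whitened coordinate
    # whose Euclidean norm equals the Mahalanobis distance of x.
    L = np.linalg.cholesky(cov)
    z_minority = np.linalg.solve(L, (minority - mu).T).T

    synthetic = np.empty((n_new, n_features))
    for i in range(n_new):
        z = z_minority[rng.integers(len(z_minority))]
        radius = np.linalg.norm(z)

        # Jitter the seed in whitened space, then project back onto the
        # sphere of the seed's radius to preserve its distance.
        candidate = z + spread * radius * rng.standard_normal(n_features)
        norm = np.linalg.norm(candidate)
        candidate = z if norm == 0.0 else candidate * (radius / norm)

        # Map back to the original feature space.
        synthetic[i] = mu + L @ candidate
    return synthetic
```

A call such as `swim_md_sketch(X_majority, X_minority, n_new=100)` returns an array of shape `(100, n_features)`. Because the sketch only needs the mean and covariance of the majority class, it remains usable when only a handful of minority seeds are available, which is the extreme-imbalance setting the abstract targets.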
