Borderline over-sampling for imbalanced data classification

Hien M. Nguyen; Eric W. Cooper; Katsuari Kamei

Abstract

Traditional classification algorithms usually predict the minority class of imbalanced data sets with poor accuracy. This paper proposes a new method for dealing with imbalanced data sets by over-sampling the borderline minority class instances. A Support Vector Machine (SVM) classifier is then trained to predict future instances. Unlike other over-sampling methods, the proposed method focuses only on the minority class instances residing along the decision boundary, because this region is the most crucial for establishing the decision boundary. Furthermore, the artificial minority instances are generated so that regions of the minority class with fewer majority class instances are expanded by extrapolation; otherwise, the current boundary of the minority class is consolidated by interpolation. Experimental results show that the proposed method achieves better performance than other over-sampling methods.
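
The generation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The borderline minority instances are assumed to have been identified already (for example, as the minority-class support vectors of a trained SVM). The function name `borderline_oversample`, the parameters `m`, `k` and `seed`, the Euclidean distance, the one-half majority threshold and the uniform random step size are all assumptions made for illustration; the abstract specifies none of them.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, candidates, k):
    """Return the k candidates closest to point, skipping point itself."""
    others = [c for c in candidates if c != point]
    return sorted(others, key=lambda c: euclidean(point, c))[:k]


def borderline_oversample(borderline, minority, majority, n_new, m=5, k=3, seed=0):
    """Generate n_new synthetic minority points around borderline instances.

    Assumptions (not from the abstract): Euclidean distance, a threshold of
    half of the m nearest neighbours, and a uniform step size in [0, 1).
    The minority set must contain at least two distinct points.
    """
    rng = random.Random(seed)
    labelled = [(p, "min") for p in minority] + [(p, "maj") for p in majority]
    synthetic = []
    for i in range(n_new):
        base = borderline[i % len(borderline)]
        # Count majority points among the m nearest neighbours of base.
        neighbours = sorted(
            (pl for pl in labelled if pl[0] != base),
            key=lambda pl: euclidean(base, pl[0]),
        )[:m]
        n_major = sum(1 for _, label in neighbours if label == "maj")
        # Pick one of the k nearest minority neighbours as a partner.
        partner = rng.choice(nearest(base, minority, k))
        step = rng.random()
        if n_major < m / 2:
            # Few majority neighbours: extrapolate away from the partner,
            # expanding the minority region outward.
            new = tuple(b + step * (b - p) for b, p in zip(base, partner))
        else:
            # Many majority neighbours: interpolate toward the partner,
            # consolidating the current minority boundary.
            new = tuple(b + step * (p - b) for b, p in zip(base, partner))
        synthetic.append(new)
    return synthetic
```

The synthetic points would then be added to the minority class before the final SVM is trained. For an existing implementation of this kind of SVM-based over-sampling, the `SVMSMOTE` class in the imbalanced-learn library may be a more practical starting point than this sketch.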

Bibliographic details

Source
International journal of knowledge engineering and soft data paradigms | 2011, No. 1 | pp. 4-21 | 18 pages
Authors
Hien M. Nguyen; Eric W. Cooper; Katsuari Kamei
Author affiliations

Graduate School of Science and Engineering, Ritsumeikan University, 1-1-1 Noji Higashi, Kusatsu, Shiga 525-8577, Japan;

College of Information Science and Engineering, Ritsumeikan University, 1-1-1 Noji Higashi, Kusatsu, Shiga 525-8577, Japan;

College of Information Science and Engineering, Ritsumeikan University, 1-1-1 Noji Higashi, Kusatsu, Shiga 525-8577, Japan;

Indexing information
Original format: PDF
Language: English
Keywords
imbalanced data sets; over-sampling; support vector machines;

Similar literature

1. Borderline Over-sampling in Feature Space for Learning Algorithms in Imbalanced Data Environments [J]. Kittipat Savetratanakaree, Kingkarn Sookhanaphibarn, Sarun Intakosum. IAENG International Journal of Computer Science. 2016, No. 3
2. K-Neighbor over-sampling with cleaning data: a new approach to improve classification performance in data sets with class imbalance [J]. Budi Santoso, Hari Wijayanto, Khairil Anwar Notodiputro. Applied mathematical sciences. 2018, No. 9a12
3. Random and Synthetic Over-Sampling Approach to Resolve Data Imbalance in Classification [J]. Mardhiya Hayaty, Siti Muthmainah, Syed Muhammad Ghufran. International Journal of Artificial Intelligence Research. 2021, No. 2
4. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning [C]. Hui Han, Wen-Yuan Wang, Bing-Huan Mao. International Conference on Advances in Intelligent Computing (ICIC 2005), 23-26 August 2005, Hefei (CN). 2005
5. Deep Learning Based Imbalanced Data Classification and Information Retrieval for Multimedia Big Data [D]. Yan, Yilin. 2018
6. An efficient algorithm coupled with synthetic minority over-sampling technique to classify imbalanced PubChem BioAssay data [O]. Ming Hao, Yanli Wang, Stephen H. Bryant
7. Borderline Over-sampling for Imbalanced Data Classification [O]. Nguyen, Hien M., Cooper, Eric W., Kamei, Katsuari. 2009
