Journal of Theoretical and Applied Information Technology

TECHNIQUES FOR HANDLING IMBALANCED DATASETS WHEN PRODUCING CLASSIFIER MODELS


Abstract

Imbalanced datasets are a well-known problem in data mining, where the datasets are composed of two classes: the majority class and the minority class. The majority class has more instances than the minority class. Recent years have brought increased interest in handling imbalanced datasets, since many datasets are naturally imbalanced. Most existing classification techniques ignore the imbalance and focus on the overall accuracy of the model produced, which is biased toward the majority class and gives poor accuracy on the minority class. Although minority-class events are rare, in some conditions they have an important influence on the classifier model. This paper attempts to list all the techniques for handling imbalanced datasets and to compare them with respect to producing the best classifier model for imbalanced datasets. These techniques are categorized into sampling, feature selection, and algorithmic approaches, organized as a taxonomy for handling imbalanced datasets. The strengths and weaknesses of these approaches are discussed in order to identify an appropriate technique that will improve the performance of the resulting classifier model. Recent trends in handling imbalanced datasets are also discussed, based on the domain and the problems that exist in the dataset.
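
The accuracy bias described in the abstract can be illustrated with a small sketch. This is not taken from the paper: the 95/5 class split, the always-predict-majority classifier, and the helper names `per_class_recall` and `random_oversample` are all assumptions chosen for illustration. Random oversampling is shown only as one simple instance of the sampling category the abstract names.

```python
import random
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's instances predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many instances as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Assumed toy data: 95 majority-class (0) and 5 minority-class (1) instances.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

# Overall accuracy looks high even though no minority instance is found.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

# Rebalance the class counts before training a (hypothetical) classifier.
rows = [[i] for i in range(100)]
balanced_rows, balanced_labels = random_oversample(rows, y_true)
balanced_counts = Counter(balanced_labels)
```

Under these assumptions the overall accuracy is high while the minority-class recall is zero, which is why per-class measures are usually reported alongside accuracy on imbalanced data. Oversampling by duplication only equalizes class counts; it adds no new information and can encourage overfitting to the repeated minority rows.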
