AUC-maximized high-accuracy classifier for imbalanced datasets

Abstract

An AUC-maximized, high-accuracy classification method and system for imbalanced datasets integrates three strategies: under-sampling with ensembling, removal of true outliers, and concealment of fake outliers. The aim is to improve both AUC and accuracy in imbalanced classification effectively and robustly. Applying under-sampling to construct multiple sub-datasets and combining the classification results of multiple classifiers greatly reduces the risk of misclassification and leads to highly accurate and robust results in imbalanced classification tasks. Moreover, the invention detects and identifies deeply hidden outliers within each sub-dataset, which consists of a subset of the majority class together with the entire minority class. In this way, more hidden outliers can be located and removed, so that they exert less influence on the decision boundary, which contributes to both high AUC and high accuracy. Furthermore, the invention proposes concealing fake outliers when building the decision boundary, which can achieve a higher classification accuracy for the majority class without changing that of the minority class.
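The under-sampling-and-ensemble step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not name a base classifier, so a nearest-centroid model is swapped in as a stand-in, and all function names (`train_undersampled_ensemble`, `minority_score`, `predict`, `auc`) are hypothetical. The true-outlier removal and fake-outlier concealment steps are omitted because the abstract does not specify how they work.

```python
import random
from statistics import mean


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_undersampled_ensemble(majority, minority, n_models=5, seed=0):
    """Train one stand-in model per balanced sub-dataset.

    Each sub-dataset pairs a random majority subset (same size as the
    minority class) with the entire minority class, as in the abstract.
    A model here is just a (majority centroid, minority centroid) pair.
    """
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    models = []
    for _ in range(n_models):
        sub_majority = rng.sample(majority, k)  # under-sample the majority
        models.append((centroid(sub_majority), centroid(minority)))
    return models


def minority_score(models, x):
    """Ensemble score; higher means x looks more like the minority class."""
    return mean(sq_dist(x, c_maj) - sq_dist(x, c_min) for c_maj, c_min in models)


def predict(models, x):
    """Majority vote over the ensemble: 1 = minority class, 0 = majority."""
    votes = sum(sq_dist(x, c_min) < sq_dist(x, c_maj) for c_maj, c_min in models)
    return 1 if 2 * votes > len(models) else 0


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Toy imbalanced data (assumed for illustration): 100 majority points
# in a small grid near the origin, 10 minority points near (3.5, 3.5).
majority = [(0.1 * (i % 10), 0.1 * (i // 10)) for i in range(100)]
minority = [(3.0 + 0.1 * i, 3.0 + 0.1 * i) for i in range(10)]

models = train_undersampled_ensemble(majority, minority)
ensemble_auc = auc(
    [minority_score(models, x) for x in minority],
    [minority_score(models, x) for x in majority],
)
```

Averaging scores or votes across several balanced sub-datasets is what keeps any single unlucky majority subsample from dominating the decision, which is the robustness argument the abstract makes for the ensemble.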
