Journal: The Open Electrical & Electronic Engineering Journal

An Improved Random Forest Algorithm for Class-Imbalanced Data Classification and its Application in PAD Risk Factors Analysis

Abstract

Classification is one of the important research subjects in machine learning. However, most machine learning algorithms train a classifier on the assumption that each class has roughly the same number of training examples. When a classifier is trained on imbalanced data, its performance declines noticeably. To address the class-imbalance problem, we propose an improved random forest algorithm based on sampling with replacement. We randomly extract multiple example subsets with replacement from the majority class, each containing the same number of examples as the minority class dataset. Multiple new training datasets are then constructed by combining each extracted majority subset with the minority class dataset, and a random forest classifier is trained on each of these training datasets. For a prediction example, the class is determined by majority voting among the random forest classifiers. Experimental results on five groups of UCI datasets and a real clinical dataset show that the proposed method can handle class-imbalanced data and that the improved random forest algorithm outperforms the original random forest and other methods in the literature.
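The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`balanced_subsets`, `fit_ensemble`, `predict_majority`), the number of subsets, the forest size, and the seeds are all assumptions, since the abstract does not specify them. The default base learner assumes scikit-learn's `RandomForestClassifier` is installed, but any object with `fit` and `predict` methods can be supplied instead.

```python
import random
from collections import Counter


def balanced_subsets(X, y, n_subsets, minority_label, seed=0):
    """Yield n_subsets balanced training sets (hypothetical helper).

    Each set holds every minority example plus an equally sized sample
    of majority examples drawn with replacement.
    """
    rng = random.Random(seed)
    minority = [(x, lab) for x, lab in zip(X, y) if lab == minority_label]
    majority = [(x, lab) for x, lab in zip(X, y) if lab != minority_label]
    for _ in range(n_subsets):
        # random.choices samples with replacement
        drawn = rng.choices(majority, k=len(minority))
        combined = minority + drawn
        yield [x for x, _ in combined], [lab for _, lab in combined]


def default_forest(seed):
    # Assumption: scikit-learn is available; n_estimators=100 is arbitrary.
    from sklearn.ensemble import RandomForestClassifier
    return RandomForestClassifier(n_estimators=100, random_state=seed)


def fit_ensemble(X, y, minority_label, n_subsets=11,
                 make_classifier=default_forest, seed=0):
    """Train one classifier per balanced subset and return them all."""
    models = []
    subsets = balanced_subsets(X, y, n_subsets, minority_label, seed)
    for i, (X_sub, y_sub) in enumerate(subsets):
        model = make_classifier(seed + i)
        model.fit(X_sub, y_sub)
        models.append(model)
    return models


def predict_majority(models, X):
    """Predict each example's class by majority vote across the models."""
    votes = [model.predict(X) for model in models]  # one label list per model
    return [Counter(column).most_common(1)[0][0] for column in zip(*votes)]
```

With two classes, an odd `n_subsets` avoids tied votes. Because every subset keeps all minority examples and only the majority sample changes, the classifiers differ mainly in which majority examples they have seen.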
