Journal: International Journal of Open Source Software & Processes
A Study on Class Imbalancing Feature Selection and Ensembles on Software Reliability Prediction

Abstract

Early software defect prediction models can improve software quality. However, two major challenges hinder model performance: class imbalance caused by the underrepresentation of defects, and irrelevant metrics used to predict them. This article presents a new two-stage framework that combines an Ensemble of Hybrid Feature selection (EHF) with Weighted Support Vector Machine Boosting (WSVMBoost) to further enhance model performance. EHF is an ensemble feature ranking built from feature selection models such as filters and embedded models, and it selects the relevant metrics. The study also explores the classification ensembles Random Forest, RUSBoost, and WSVMBoost and the base learners Decision Tree and SVM on five software reliability datasets. In the statistical tests, EHF with WSVMBoost attained the best mean rank in performance among the feature selection hybrids for predicting software defects. Additionally, the study shows that the McCabe and Halstead method-level metrics are equally important in improving model performance.
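The ensemble feature ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the individual rankings are combined, so the mean-rank aggregation used here is an assumption, and the metric names and scores are hypothetical values chosen only for illustration. They are not taken from the study.

```python
from statistics import mean


def rank_features(scores):
    """Map each feature to its rank (1 = most relevant) for one selector."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def ensemble_ranking(score_tables, top_k):
    """Average each feature's rank across selectors and keep the top_k best.

    Mean-rank aggregation is an assumed combination rule, not necessarily
    the one used by the EHF method in the article.
    """
    per_selector_ranks = [rank_features(table) for table in score_tables]
    features = per_selector_ranks[0].keys()
    mean_rank = {f: mean(r[f] for r in per_selector_ranks) for f in features}
    # Lower mean rank is better; ties are broken by name for determinism.
    ordered = sorted(mean_rank, key=lambda f: (mean_rank[f], f))
    return ordered[:top_k], mean_rank


# Hypothetical relevance scores from one filter and one embedded selector.
filter_scores = {
    "loc": 0.9, "cyclomatic": 0.7, "halstead_volume": 0.6, "comment_ratio": 0.1,
}
embedded_scores = {
    "loc": 0.5, "cyclomatic": 0.8, "halstead_volume": 0.6, "comment_ratio": 0.05,
}

selected, mean_rank = ensemble_ranking([filter_scores, embedded_scores], top_k=2)
print(selected)
```

In this sketch a metric ranked highly by only one selector can still be outranked by one that both selectors rate moderately well, which is the usual motivation for combining filter and embedded rankings.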