Journal: 《计算机技术与发展》 (Computer Technology and Development)

Class-Imbalance Sparse Reconstruction Metric Learning for Software Defect Prediction

         

Abstract

Software defect prediction (SDP) is an important means of improving software quality. Many recent results from machine learning have been applied to improve defect prediction performance. However, SDP datasets usually have an imbalanced class distribution, which can degrade prediction performance. To address this problem, we propose a software defect prediction method termed class-imbalance sparse reconstruction metric learning (CSRML). First, CSRML introduces a cost-sensitive factor into metric learning, so that a distance metric feature matrix can be learned while also accounting for the different costs of misclassification in SDP. Second, a weight parameter is added to the objective function to further improve the accuracy of distance metric learning for minority-class samples. Finally, an improved weighted KNN (IWKNN) method is used to predict the labels of test samples, addressing class imbalance in the prediction phase. Experiments on the NASA SDP benchmark datasets show that the proposed method improves recall and F-measure, and thereby classification performance.
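The abstract does not specify how IWKNN computes its weights, so the following is only a minimal sketch of the general idea of a class-weighted KNN vote under a learned metric, not the authors' algorithm. It assumes a Mahalanobis-style distance defined by a matrix M (here simply the identity, standing in for a learned metric) and assumes each neighbour's vote is scaled by inverse distance and by the inverse frequency of its class. The function names, the weighting scheme, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mahalanobis(x, y, M):
    """Distance sqrt((x-y)^T M (x-y)) for a positive semidefinite matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    # Compute M @ diff, then take the dot product with diff.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    return math.sqrt(max(0.0, sum(d_i * v for d_i, v in zip(diff, m_diff))))


def weighted_knn_predict(X_train, y_train, x, M, k=3, eps=1e-9):
    """Predict a label from the k nearest neighbours of x.

    Assumed weighting (not taken from the paper): each neighbour's vote
    is 1/distance multiplied by 1/(number of training samples in its class),
    so minority-class neighbours are not drowned out by the majority class.
    """
    class_counts = Counter(y_train)
    neighbours = sorted(
        ((mahalanobis(xi, x, M), yi) for xi, yi in zip(X_train, y_train)),
        key=lambda pair: pair[0],
    )[:k]
    scores = Counter()
    for dist, label in neighbours:
        scores[label] += (1.0 / (dist + eps)) * (1.0 / class_counts[label])
    return max(scores, key=scores.get)


# Toy imbalanced data: six non-defective modules (0), two defective (1).
X_train = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (1, 2), (3, 3), (3, 2)]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned metric

print(weighted_knn_predict(X_train, y_train, (2.2, 2.0), identity, k=3))
print(weighted_knn_predict(X_train, y_train, (0.5, 0.5), identity, k=3))
```

For the query (2.2, 2.0), two of the three nearest neighbours belong to the majority class, so an unweighted majority vote would label it non-defective; under the assumed weighting the single minority-class neighbour wins the vote. This illustrates why a weighted vote can raise recall on the defective class.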
