Conference: IEEE International Symposium on Software Reliability Engineering

The Impact of Feature Selection on Defect Prediction Performance: An Empirical Comparison



Abstract

Software defect prediction aims to determine whether a software module is defect-prone by constructing prediction models. The performance of such models is sensitive to the high dimensionality of the datasets, which may include irrelevant and redundant features. Feature selection is applied to alleviate this issue. Because many feature selection methods have been proposed, there is a pressing need to analyze and compare them. Prior empirical studies have potential limitations, such as contradictory results, the use of private datasets, and inappropriate statistical test techniques. This observation leads us to conduct a careful empirical study that strengthens confidence in the experimental conclusions by considering several potential sources of bias, such as noise in the dataset and the dataset type. In this paper, we investigate the impact of 32 feature selection methods on defect prediction performance over two versions of the NASA dataset (i.e., the noisy and clean NASA datasets) and one open-source AEEEM dataset. We use a state-of-the-art double Scott-Knott test technique to analyze these methods. Experimental results show that the effectiveness of these feature selection methods on defect prediction performance varies significantly over all the datasets.
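
To make the idea of feature selection concrete, the following is a minimal sketch of one simple filter-based approach: rank each metric by the absolute Pearson correlation with the defect label and keep the top k. It is not one of the 32 methods evaluated in the paper, nor the authors' implementation. The function names (`pearson`, `select_top_k`), the metric names, and the toy data are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_top_k(rows, labels, feature_names, k):
    """Rank features by |correlation with the defect label| and keep the top k."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: each row is one module, each column a made-up metric.
feature_names = ["loc", "cyclomatic", "build_id"]
rows = [
    [120, 14, 7],
    [30, 2, 7],
    [200, 25, 7],
    [45, 3, 7],
    [160, 18, 7],
    [25, 1, 7],
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = defect-prone, 0 = clean

selected, scores = select_top_k(rows, labels, feature_names, k=2)
print(selected)
```

In this toy example the constant `build_id` column scores 0.0 and is dropped as irrelevant. A ranking filter of this kind does not remove redundant features (here `loc` and `cyclomatic` are strongly correlated with each other); subset-based or wrapper-based methods would be needed for that.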
