Software Fault Proneness Prediction with Group Lasso Regression: On Factors that Affect Classification Performance


Abstract

Machine learning algorithms have been used extensively for software fault proneness prediction. This paper presents the first application of Group Lasso Regression (G-Lasso) to software fault proneness classification and compares its performance to six widely used machine learning algorithms. Furthermore, we explore the effects of two factors on prediction performance: the effect of imbalance treatment using the Synthetic Minority Over-sampling Technique (SMOTE), and the effect of the datasets used in building the prediction models. Our experimental results are based on 22 datasets extracted from open source projects. The main findings include: (1) G-Lasso is robust to imbalanced data and significantly outperforms the other machine learning algorithms with respect to Recall and G-Score, i.e., the harmonic mean of Recall and (1 - False Positive Rate). (2) Even though SMOTE improved the performance of all learners, it did not have a statistically significant effect on G-Lasso's Recall and G-Score. Random Forest was in the top performing group of learners for all performance metrics, while Naive Bayes performed the worst of all learners. (3) When using the same change metrics as features, the choice of the dataset had no effect on the performance of most learners, including G-Lasso. Naive Bayes was the most affected, especially when balanced datasets were used.
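The G-Score defined in the abstract, the harmonic mean of Recall and (1 - False Positive Rate), can be computed directly from confusion-matrix counts. The following is a minimal sketch, not the authors' code: the function name `g_score`, its argument names, and the example counts are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience.

```python
def g_score(tp: int, fn: int, fp: int, tn: int) -> float:
    """Harmonic mean of Recall and (1 - False Positive Rate).

    tp, fn, fp, tn are confusion-matrix counts, with fault-prone
    modules treated as the positive class.
    """
    # Recall: fraction of fault-prone modules that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # False Positive Rate: fraction of fault-free modules flagged as fault-prone.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    one_minus_fpr = 1.0 - fpr
    # Assumption: define the score as 0.0 when both terms are zero.
    if recall + one_minus_fpr == 0:
        return 0.0
    return 2 * recall * one_minus_fpr / (recall + one_minus_fpr)


# Hypothetical counts, for illustration only (not results from the paper).
score = g_score(tp=30, fn=10, fp=20, tn=140)
print(round(score, 4))
```

Because it is a harmonic mean, the score is pulled toward the smaller of the two terms, so a learner cannot score well by maximizing Recall while flagging most fault-free modules.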


Bibliographic details

Source
IEEE Annual Computer Software and Applications Conference | 2019 | 1 v. | 8 pages
Authors
Katerina Goseva-Popstojanova; Mohammad Ahmad; Yasser Alshehri
Original format: PDF
Chinese Library Classification: Computer software
Keywords
Software; Measurement; Machine learning algorithms; Prediction algorithms; Radio frequency; Software algorithms; Predictive models

