Supervised vs Unsupervised Models: A Holistic Look at Effort-Aware Just-in-Time Defect Prediction

Abstract

Effort-aware just-in-time (JIT) defect prediction aims at finding more defective software changes with limited code inspection cost. Traditionally, supervised models have been used; however, they require sufficient labelled training data, which is difficult to obtain, especially for new projects. Recently, Yang et al. proposed an unsupervised model (LT) and applied it to projects with rich historical bug data. Interestingly, they reported that, under the same inspection cost (i.e., 20 percent of the total lines of code modified by all changes), it could find more defective changes than a state-of-the-art supervised model (i.e., EALR). This is surprising as supervised models that benefit from historical data are expected to perform better than unsupervised ones. Their finding suggests that previous studies on defect prediction had made a simple problem too complex. Considering the potential high impact of Yang et al.'s work, in this paper, we perform a replication study and present the following new findings: (1) Under the same inspection budget, LT requires developers to inspect a large number of changes necessitating many more context switches. (2) Although LT finds more defective changes, many highly ranked changes are false alarms. These initial false alarms may negatively impact practitioners' patience and confidence. (3) LT does not outperform EALR when the harmonic mean of Recall and Precision (i.e., F1-score) is considered. Aside from highlighting the above findings, we propose a simple but improved supervised model called CBS. When compared with EALR, CBS detects about 15% more defective changes and also significantly improves Precision and F1-score. When compared with LT, CBS achieves similar results in terms of Recall, but it significantly reduces context switches and false alarms before first success. Finally, we also discuss the implications of our findings for practitioners and researchers.
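The measures named in the abstract (Recall, Precision and F1-score under a fixed inspection budget, the number of changes a developer must open, and the false alarms before the first success) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Change` class, the function name `evaluate_ranking`, the result keys, and the rule that inspection stops before the first change that would exceed the budget are all assumptions made for illustration. The sketch evaluates any already-ranked list of changes, whichever model (LT, EALR, CBS or another) produced the ranking.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Change:
    churn: int        # lines of code modified by the change (inspection effort)
    defective: bool   # ground-truth label


def evaluate_ranking(ranked: List[Change], budget_ratio: float = 0.2) -> Dict[str, float]:
    """Evaluate a ranked list of changes under an inspection budget.

    The budget is budget_ratio of the total lines modified by all changes.
    Assumption: inspection stops before the first change that would
    push the inspected lines over the budget.
    """
    total_churn = sum(c.churn for c in ranked)
    total_defective = sum(1 for c in ranked if c.defective)
    budget = budget_ratio * total_churn

    spent = 0
    inspected: List[Change] = []
    for change in ranked:
        if spent + change.churn > budget:
            break
        spent += change.churn
        inspected.append(change)

    hits = sum(1 for c in inspected if c.defective)
    recall = hits / total_defective if total_defective else 0.0
    precision = hits / len(inspected) if inspected else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)

    # Count clean changes inspected before the first defective one is found.
    initial_false_alarms = 0
    for change in inspected:
        if change.defective:
            break
        initial_false_alarms += 1

    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        # Each inspected change is one context switch for the developer.
        "changes_inspected": len(inspected),
        "initial_false_alarms": initial_false_alarms,
    }
```

A ranking that places many small changes first can reach high Recall within the budget while still scoring poorly on the last two measures, which is the trade-off the abstract reports for LT.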


Bibliographic details

Source
IEEE International Conference on Software Maintenance and Evolution | 2017 | 689p | 12 pages
Authors
Qiao Huang; Xin Xia; David Lo
Original format: PDF
Chinese Library Classification: TP311.5-53
Keywords
Predictive models; Inspection; Measurement; Computer bugs; Analytical models; Feature extraction; Software


Similar literature


1. Revisiting supervised and unsupervised models for effort-aware just-in-time defect prediction [J]. Huang Qiao, Xia Xin, Lo David. Empirical Software Engineering. 2019, Issue 5
2. Effort-Aware semi-Supervised just-in-Time defect prediction [J]. Li Weiwei, Zhang Wenzhou, Jia Xiuyi. Information and software technology. 2020, Issue Octa
3. Supervised vs Unsupervised Models: A Holistic Look at Effort-Aware Just-in-Time Defect Prediction [C]. Qiao Huang, Xin Xia, David Lo. IEEE International Conference on Software Maintenance and Evolution. 2017
4. Unsupervised and semi-supervised training methods for eukaryotic gene prediction [D]. Ter-Hovhannisyan, Vardges. 2008
5. Effort-aware and just-in-time defect prediction with neural network [O]. Lei Qiao, Yan Wang
6. Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models [O]. Yang YB, Zhou YM, Liu JP. 2016
