Journal: IEICE Transactions on Information and Systems

Automated Duplicate Bug Report Detection Using Multi-Factor Analysis


Abstract

Bug reports written in natural language are often voluminous, ambiguous, and poorly written, which makes detecting duplicate bug reports challenging. Current automatic duplicate bug report detection techniques focus mainly on textual information and ignore other useful factors. To improve detection accuracy, this paper proposes a new approach called the LNG (LDA and N-gram) model, which takes advantage of both the topic model LDA and the word-based N-gram model. LNG considers multiple factors that potentially affect detection accuracy: textual information, semantic correlation, word order, contextual connections, and categorical information. In addition, the N-gram model adopted in LNG is improved by modifying its similarity algorithm. The experiment is conducted on more than 230,000 real bug reports from the Eclipse project. For the evaluation, we propose a new metric, the exact-accuracy (EA) rate, which can improve understanding of duplicate detection performance. The results show that the recall rate, precision rate, and EA rate of the proposed method are all higher than when LDA and N-gram are used separately. The recall rate also improves by 2.96%-10.53% compared to the state-of-the-art approach DBTM.
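To make the word-order factor concrete, here is a minimal sketch of a word-level N-gram similarity between two bug report texts. It is an illustration only: the abstract does not specify the modified similarity algorithm used in LNG, so this uses a plain Jaccard overlap of bigrams, and the function names `word_ngrams`, `ngram_similarity`, and `rank_candidates` are hypothetical.

```python
import re


def word_ngrams(text, n=2):
    """Return the set of word-level n-grams in `text` (lowercased)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Jaccard overlap of word n-grams (an assumption, not the paper's
    modified similarity). Returns 0.0 when either text has no n-grams."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def rank_candidates(new_report, existing_reports, n=2):
    """Sort existing reports by descending n-gram similarity to the new one."""
    scored = [(ngram_similarity(new_report, r, n), r) for r in existing_reports]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Because the comparison is over ordered word pairs, two reports that share the same vocabulary in a different order score lower than two reports that share phrasing, which is the kind of signal a bag-of-words comparison would miss. A topic-model score could be combined with this value, but how LNG weights the two is not stated in the abstract.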
