Automatic Duplicate Bug Report Detection using Information Retrieval-based versus Machine Learning-based Approaches

Abstract

Many software repositories, especially web-based ones, pose tasks that remain challenging to automate. Duplicate bug report detection (DBRD) has been a prominent problem since 2004 in software triage systems such as Bugzilla, an essential online software repository. There are two main approaches to automatic DBRD: information retrieval (IR)-based and machine learning (ML)-based. Many related works use both approaches, but it is not clear which one is more useful or performs better. This study introduces a methodology for comparing the validation performance of the two approaches under a particular condition. The Android dataset is used for evaluation: about 2 million pairs of bug reports are analyzed for 59 bug reports that were duplicates. The results show that the ML-based approach has better validation performance, by about 40%. In addition, the ML-based approach can be evaluated with more reliable criteria such as accuracy, precision, and recall, whereas the IR-based approach is evaluated only with mean average precision (MAP) or rank metrics.
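To make the two families of evaluation criteria named in the abstract concrete, here is a minimal Python sketch. It is not the authors' code or data: the function names and the toy labels are assumptions chosen for illustration. The first function scores a pairwise classifier (the ML-style view: each pair of reports is labeled duplicate or not), and the other two score ranked candidate lists (the IR-style view).

```python
from typing import Sequence, Tuple


def precision_recall_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Pairwise classification metrics; 1 = duplicate pair, 0 = non-duplicate pair."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


def average_precision(ranked_relevance: Sequence[int]) -> float:
    """Average precision for one query (one new bug report).

    ranked_relevance[i] is 1 if the candidate at rank i+1 is a true duplicate.
    Assumption: every true duplicate appears somewhere in the ranked list,
    so we normalize by the number of hits found in the list.
    """
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(queries: Sequence[Sequence[int]]) -> float:
    """MAP: the mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Toy data (hypothetical, not taken from the Android dataset).
labels = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
precision, recall, accuracy = precision_recall_accuracy(labels, predictions)

rankings = [[0, 1, 0, 1], [1, 0, 0]]
map_score = mean_average_precision(rankings)
```

The classification metrics are computed over every labeled pair, while MAP only reflects where the true duplicates land in each ranked list, which is one way to read the abstract's remark that the two approaches are judged by different kinds of criteria.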

Bibliographic details

Source
International Conference on Web Research, 2020, pp. 288-293 (6 pages)
Authors
Behzad Soleimani Neysiani; Seyed Morteza Babamir
Original format
PDF
Keywords
Duplicate Detection; Bug Report; Information Retrieval; Machine Learning

Similar literature

1. An HMM-based approach for automatic detection and classification of duplicate bug reports [J]. Ebrahimi Neda, Trabelsi Abdelaziz, Islam Md Shariful. Information and Software Technology, 2019, issue SEPa.
2. On the relationship between bug reports and queries for text retrieval-based bug localization [J]. Chris Mills, Esteban Parra, Jevgenija Pantiuchina. Empirical Software Engineering, 2020, issue 5.
3. Automatic Duplicate Bug Report Detection using Information Retrieval-based versus Machine Learning-based Approaches [C]. Behzad Soleimani Neysiani, Seyed Morteza Babamir. International Conference on Web Research, 2020.
4. A contextual approach towards more accurate duplicate bug report detection [D]. Alipour, Anahita. 2013.
5. Robust Machine Learning-Based Correction on Automatic Segmentation of the Cerebellum and Brainstem [O]. Jun Yi Wang, Michael M. Ngo, David Hessl.
6. An HMM-based approach for automatic detection and classification of duplicate bug reports [O]. Neda Ebrahimi, Abdelaziz Trabelsi, Md. Shariful Islam. 2019.
