MODEL-BASED SOFTWARE EFFORT ESTIMATION - A ROBUST COMPARISON OF 14 ALGORITHMS WIDELY USED IN THE DATA SCIENCE COMMUNITY

PASSAKORN PHANNACHITTA; KENICHI MATSUMOTO

Abstract

The emergence of data science as a discipline has facilitated the development of novel and advanced machine-learning algorithms for data-analytics tasks. For example, ensemble learning and deep learning have frequently achieved promising results in recent data-science competitions, such as those hosted by Kaggle. However, the performance of these algorithms on software effort estimation has not yet been thoroughly assessed. In this study, an assessment framework known as the stable-ranking-indication method is adopted to compare 14 machine-learning algorithms widely used in the data science community. The comparisons were carried out over 13 industrial datasets, using six robust and independent performance metrics, and were supported by the Brunner statistical test. The results proved to be stable, because similar machine-learning algorithms achieved similar performance; in particular, random forest and bagging performed best among the compared algorithms. The results further offered evidence on how to build an effective stacked ensemble: the overall expected performance of the stack is maximized by balancing two goals, namely maximizing expected accuracy by selecting only the solo algorithms most likely to perform outstandingly on the dataset, and maximizing the diversity of the algorithms. Specifically, the stack combining bagging, random forest, analogy-based estimation, AdaBoost, the gradient boosting machine, and ordinary least squares regression was shown to be the optimal stack for the software effort estimation datasets.
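
As a rough illustration of the stack described in the abstract, the following sketch combines the six named learner families using scikit-learn. The choice of library, all hyperparameters, the use of k-nearest neighbours as a stand-in for analogy-based estimation, the OLS meta-learner, and the synthetic data are assumptions made here for illustration; the abstract does not state how the authors implemented or tuned their stack.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    BaggingRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor


def build_stack(random_state=0):
    """Stack the six learner families named in the abstract.

    Every setting below is an illustrative assumption, not the paper's setup.
    """
    base_learners = [
        ("bagging", BaggingRegressor(n_estimators=25, random_state=random_state)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=random_state)),
        # Assumption: k-NN regression stands in for analogy-based estimation.
        ("abe", KNeighborsRegressor(n_neighbors=3)),
        ("adaboost", AdaBoostRegressor(n_estimators=25, random_state=random_state)),
        ("gbm", GradientBoostingRegressor(n_estimators=50, random_state=random_state)),
        ("ols", LinearRegression()),
    ]
    # Assumption: an OLS meta-learner fitted on out-of-fold base predictions.
    return StackingRegressor(
        estimators=base_learners,
        final_estimator=LinearRegression(),
        cv=5,
    )


# Synthetic stand-in for an effort dataset (three size-like features, one effort target).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 100.0, size=(80, 3))
y = 2.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(0.0, 5.0, size=80)

stack = build_stack()
stack.fit(X[:60], y[:60])          # train on the first 60 projects
predictions = stack.predict(X[60:])  # estimate effort for the remaining 20
```

The base learners deliberately mix tree ensembles, an instance-based learner, and a linear model, which reflects the abstract's point that a stack benefits from both individually strong and mutually diverse members.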

Bibliographic details

Source
International Journal of Innovative Computing Information and Control | 2019, Issue 2 | pp. 569-589 | 21 pages
Authors
PASSAKORN PHANNACHITTA; KENICHI MATSUMOTO
Author affiliations

College of Arts Media and Technology Chiang Mai University 239 Suthep Muang Chiang Mai 50200 Thailand;

Graduate School of Science and Technology Nara Institute of Science and Technology 8916-5 Takayama Ikoma Nara 630-0192 Japan;

Original format: PDF
Language: English
Keywords
Software effort estimation; Data science; Kaggle; Robust statistics; Empirical software engineering;
