Staff View: Review and empirical analysis of machine learning-based software effort estimation

Review and empirical analysis of machine learning-based software effort estimation

The average software company spends a huge amount of its revenue on Research and Development (R&D) for how to deliver software on time. Accurate software effort estimation is critical for successful project planning, resource allocation, and on-time delivery within budget for sustainable softwar...

Bibliographic Details
Main Authors:	Rahman, Mizanur, Sarwar, Hasan, Abdul Kader, Md, Goncalves, Teresa C.F., Ting, Tin Tin
Format:	Article
Language:	English
Published:	IEEE 2024
Subjects:	QA75 Electronic computers. Computer science QA76 Computer software TA Engineering (General). Civil engineering (General)
Online Access:	http://umpir.ump.edu.my/id/eprint/42958/1/Review%20and%20empirical%20analysis%20of%20machine%20learning-based%20software%20effort%20estimation.pdf http://umpir.ump.edu.my/id/eprint/42958/ https://doi.org/10.1109/ACCESS.2024.3404879

id	my.ump.umpir.42958
record_format	eprints
spelling	my.ump.umpir.429582025-01-08T01:43:21Z http://umpir.ump.edu.my/id/eprint/42958/ Review and empirical analysis of machine learning-based software effort estimation Rahman, Mizanur Sarwar, Hasan Abdul Kader, Md Goncalves, Teresa C.F. Ting, Tin Tin QA75 Electronic computers. Computer science QA76 Computer software TA Engineering (General). Civil engineering (General) The average software company spends a huge amount of its revenue on Research and Development (R&D) for how to deliver software on time. Accurate software effort estimation is critical for successful project planning, resource allocation, and on-time delivery within budget for sustainable software development. However, both overestimation and underestimation can pose significant challenges, highlighting the need for continuous improvement in estimation techniques. This study reviews recent machine learning approaches employed to enhance the accuracy of software effort estimation (SEE), focusing on research published between 2020 and 2023. The literature review employed a systematic approach to identify relevant research on machine learning techniques for SEE. Additionally, comparative experiments were conducted using five commonly employed Machine Learning (ML) methods: K-Nearest Neighbor, Support Vector Machine, Random Forest, Logistic Regression, and LASSO Regression. The performance of these techniques was evaluated using five widely adopted accuracy metrics: Mean Squared Error (MSE), Mean Magnitude of Relative Error (MMRE), R-squared, Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). The evaluation was carried out on seven benchmark datasets: Albrecht, Desharnais, China, Kemerer, Miyazaki94, Maxwell, and COCOMO, which are publicly available and extensively used in SEE research.
By carefully reviewing study quality, analyzing results across the literature, and rigorously evaluating experimental outcomes, clear conclusions were drawn about the most promising techniques for achieving state-of-the-art accuracy in estimating software effort. This study makes three key contributions to the field: firstly, it furnishes a thorough overview of recent machine learning research in software effort estimation (SEE); secondly, it provides data-driven guidance for researchers and practitioners to select optimal methods for accurate effort estimation; and thirdly, it demonstrates the performance of publicly available datasets through experimental analysis. Enhanced estimation supports the development of better predictive models for software project time, cost, and staffing needs. The findings aim to guide future research directions and tool development toward the most accurate machine learning approaches for modelling software development effort, costs, and delivery schedules, ultimately contributing to more efficient and cost-effective software projects. IEEE 2024 Article PeerReviewed pdf en cc_by_nc_nd_4 http://umpir.ump.edu.my/id/eprint/42958/1/Review%20and%20empirical%20analysis%20of%20machine%20learning-based%20software%20effort%20estimation.pdf Rahman, Mizanur and Sarwar, Hasan and Abdul Kader, Md and Goncalves, Teresa C.F. and Ting, Tin Tin (2024) Review and empirical analysis of machine learning-based software effort estimation. IEEE Access, 12. pp. 85661-85680. ISSN 2169-3536. (Published) https://doi.org/10.1109/ACCESS.2024.3404879
institution	Universiti Malaysia Pahang Al-Sultan Abdullah
building	UMPSA Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang Al-Sultan Abdullah
content_source	UMPSA Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	QA75 Electronic computers. Computer science QA76 Computer software TA Engineering (General). Civil engineering (General)
spellingShingle	QA75 Electronic computers. Computer science QA76 Computer software TA Engineering (General). Civil engineering (General) Rahman, Mizanur Sarwar, Hasan Abdul Kader, Md Goncalves, Teresa C.F. Ting, Tin Tin Review and empirical analysis of machine learning-based software effort estimation
description	The average software company spends a huge amount of its revenue on Research and Development (R&D) for how to deliver software on time. Accurate software effort estimation is critical for successful project planning, resource allocation, and on-time delivery within budget for sustainable software development. However, both overestimation and underestimation can pose significant challenges, highlighting the need for continuous improvement in estimation techniques. This study reviews recent machine learning approaches employed to enhance the accuracy of software effort estimation (SEE), focusing on research published between 2020 and 2023. The literature review employed a systematic approach to identify relevant research on machine learning techniques for SEE. Additionally, comparative experiments were conducted using five commonly employed Machine Learning (ML) methods: K-Nearest Neighbor, Support Vector Machine, Random Forest, Logistic Regression, and LASSO Regression. The performance of these techniques was evaluated using five widely adopted accuracy metrics: Mean Squared Error (MSE), Mean Magnitude of Relative Error (MMRE), R-squared, Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). The evaluation was carried out on seven benchmark datasets: Albrecht, Desharnais, China, Kemerer, Miyazaki94, Maxwell, and COCOMO, which are publicly available and extensively used in SEE research. By carefully reviewing study quality, analyzing results across the literature, and rigorously evaluating experimental outcomes, clear conclusions were drawn about the most promising techniques for achieving state-of-the-art accuracy in estimating software effort.
This study makes three key contributions to the field: firstly, it furnishes a thorough overview of recent machine learning research in software effort estimation (SEE); secondly, it provides data-driven guidance for researchers and practitioners to select optimal methods for accurate effort estimation; and thirdly, it demonstrates the performance of publicly available datasets through experimental analysis. Enhanced estimation supports the development of better predictive models for software project time, cost, and staffing needs. The findings aim to guide future research directions and tool development toward the most accurate machine learning approaches for modelling software development effort, costs, and delivery schedules, ultimately contributing to more efficient and cost-effective software projects.
format	Article
author	Rahman, Mizanur Sarwar, Hasan Abdul Kader, Md Goncalves, Teresa C.F. Ting, Tin Tin
author_facet	Rahman, Mizanur Sarwar, Hasan Abdul Kader, Md Goncalves, Teresa C.F. Ting, Tin Tin
author_sort	Rahman, Mizanur
title	Review and empirical analysis of machine learning-based software effort estimation
title_short	Review and empirical analysis of machine learning-based software effort estimation
title_full	Review and empirical analysis of machine learning-based software effort estimation
title_fullStr	Review and empirical analysis of machine learning-based software effort estimation
title_full_unstemmed	Review and empirical analysis of machine learning-based software effort estimation
title_sort	review and empirical analysis of machine learning-based software effort estimation
publisher	IEEE
publishDate	2024
url	http://umpir.ump.edu.my/id/eprint/42958/1/Review%20and%20empirical%20analysis%20of%20machine%20learning-based%20software%20effort%20estimation.pdf http://umpir.ump.edu.my/id/eprint/42958/ https://doi.org/10.1109/ACCESS.2024.3404879
_version_	1822924897077166080
score	13.235362

Similar Items