Investigating the use of moving windows to improve software effort prediction: a replicated study
2016 (English). In: Journal of Empirical Software Engineering, ISSN 1382-3256, E-ISSN 1573-7616, 1-52 p. Article in journal (Refereed). Epub ahead of print
To date most research in software effort estimation has not taken chronology into account when selecting projects for training and validation sets. A chronological split represents the use of a project’s starting and completion dates, such that any model that estimates effort for a new project p only uses as its training set projects that have been completed prior to p’s starting date. A study in 2009 (“S3”) investigated the use of chronological split taking into account a project’s age. The research question investigated was whether the use of a training set containing only the most recent past projects (a “moving window” of recent projects) would lead to more accurate estimates when compared to using the entire history of past projects completed prior to the starting date of a new project. S3 found that moving windows could improve the accuracy of estimates. The study described herein replicates S3 using three different and independent data sets. Estimation models were built using regression, and accuracy was measured using absolute residuals. The results contradict S3, as they do not show any gain in estimation accuracy when using windows for effort estimation. This is a surprising result: the intuition that recent data should be more helpful than old data for effort estimation is not supported. Several factors, which are discussed in this paper, might have contributed to such contradicting results. Some of our future work entails replicating this work using other datasets, to understand better when using windows is a suitable choice for software companies.
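The training-set selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Project` record, the `training_set` function, and the choice to measure the window as a count of the most recently completed projects are all assumptions made for illustration (a window could equally be defined by calendar duration).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Project:
    """Hypothetical project record; field names are illustrative only."""
    name: str
    start: date
    end: date
    effort: float  # actual effort, e.g. in person-hours


def training_set(history: List[Project], target: Project,
                 window: Optional[int] = None) -> List[Project]:
    """Select training projects for estimating the effort of `target`.

    Chronological split: keep only projects completed before the
    target's starting date. If `window` is given, keep only that many
    of the most recently completed projects (assumed window definition).
    """
    if window is not None and window < 1:
        raise ValueError("window must be a positive number of projects")

    # Only projects finished strictly before the target starts are usable.
    completed = [p for p in history if p.end < target.start]
    # Order by completion date, oldest first.
    completed.sort(key=lambda p: p.end)

    if window is None:
        return completed          # entire history prior to the target
    return completed[-window:]    # moving window of recent projects
```

Calling `training_set(history, target)` gives the full-history baseline, while `training_set(history, target, window=2)` gives the moving-window alternative; the study compares estimates from regression models trained on the two kinds of set.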
Place, publisher, year, edition, pages
Springer-Verlag New York, 2016. 1-52 p.
Computer software, Chronological splits; Chronological splitting; Effort Estimation; Moving window; Regression based estimation models; Research questions; Software effort estimation; Software effort prediction, Software engineering
Identifiers
URN: urn:nbn:se:bth-13084; DOI: 10.1007/s10664-016-9446-4; ScopusID: 2-s2.0-84983399519; OAI: oai:DiVA.org:bth-13084; DiVA: diva2:1033185