Detecting code smells using industry-relevant data
2023 (English). In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 155, article id 107112. Article in journal (Refereed). Published.
Abstract [en]
Context: Code smells are patterns in source code associated with an increased defect rate and a higher maintenance effort than usual, but without a clear definition. Code smells are often detected using rules hard-coded in detection tools. Such rules are often set arbitrarily or derived from data sets tagged by reviewers without the necessary industrial know-how. Conclusions from studying such data sets may be unreliable or even harmful, since algorithms may achieve higher values of performance metrics on them than on data sets tagged by experts, despite not being industrially useful.

Objective: Our goal is to investigate the performance of various machine learning algorithms for automated code smell detection, trained on a code smell data set (MLCQ) derived from actively developed, industry-relevant projects, with reviews performed by experienced software developers.

Method: We assign the severity of the smell to each code sample according to a consensus between the severities assigned by the reviewers, use the Matthews Correlation Coefficient (MCC) as our main performance metric to account for the entire confusion matrix, and compare median values to account for non-normal distributions of performance. We compare 6720 models built using eight machine learning techniques. The entire process is automated and reproducible.

Results: The performance of the compared techniques depends heavily on the analyzed smell. The median value of our performance metric for the best algorithm was 0.81 for Long Method, 0.31 for Feature Envy, 0.51 for Blob, and 0.57 for Data Class.

Conclusions: Random Forest and Flexible Discriminant Analysis performed best overall, but in most cases the performance difference between them and the median algorithm was no more than 10% of the latter. The performance results were stable over multiple iterations. Although the F-score omits one quadrant of the confusion matrix (and thus may differ from MCC), in code smell detection the actual differences are minimal. © 2022 Elsevier B.V.
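The abstract's contrast between MCC and the F-score can be made concrete with a minimal sketch (not taken from the paper; the confusion-matrix counts below are hypothetical): the F-score is built from precision and recall and never reads the true-negative quadrant, while MCC combines all four quadrants.

import math

def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient: uses all four quadrants."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f_score(tp, fp, fn):
    """F1-score: harmonic mean of precision and recall; tn never appears."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical detector output on an imbalanced sample set:
tp, fp, fn, tn = 40, 10, 15, 500
print(f"MCC = {mcc(tp, fp, fn, tn):.2f}, F1 = {f_score(tp, fp, fn):.2f}")

On class-imbalanced data, a large true-negative count can make the two metrics diverge, which is why the abstract's observation that they remain close for code smell detection is worth stating explicitly.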
Place, publisher, year, edition, pages
Elsevier, 2023. Vol. 155, article id 107112
Keywords [en]
Codes (symbols), Forestry, Image resolution, Learning algorithms, Learning systems, Matrix algebra, Normal distribution, Software engineering, Supervised learning, Technology transfer, Code smell, Confusion matrix, Correlation coefficient, Data set, Machine-learning, Median value, Performance, Performance metrics, Reproducible research, Source codes, Discriminant analysis, Code smells, Machine learning
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-24096
DOI: 10.1016/j.infsof.2022.107112
ISI: 000901826200018
Scopus ID: 2-s2.0-85143501730
OAI: oai:DiVA.org:bth-24096
DiVA, id: diva2:1719758
Available from: 2022-12-16. Created: 2022-12-16. Last updated: 2023-01-26. Bibliographically approved.