A comparison of machine learning algorithms on design smell detection using balanced and imbalanced dataset: A study of God class
Al-Balqa Applied University, JOR.
Uppsala University, SWE.
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-4071-4596
Universidad de Valladolid, ESP.
2022 (English). In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 143, article id 106736. Article in journal (Refereed), Published
Abstract [en]

Context: Design smell detection has proven to be a significant activity that aims not only to enhance software quality but also to extend the software life cycle.
Objective: This work investigates whether machine learning approaches can effectively be leveraged for software design smell detection. Additionally, this paper provides a comparative study focused on the use of balanced datasets, checking whether avoiding dataset balancing has any influence on accuracy and classifier behavior during design smell detection.
Method: A set of experiments has been conducted using 28 machine learning classifiers aimed at detecting God classes. The experiments used a dataset formed from 12,587 classes of 24 software systems, in which 1,958 classes were manually validated.
Results: Ultimately, most classifiers obtained high performance, with CatBoost showing higher performance. The experiments also show that data balancing does not have any significant influence on detection accuracy. This reinforces the applicability of machine learning in real scenarios, where the data is usually imbalanced owing to the inherent nature of design smells.
Conclusions: Machine learning approaches can effectively be leveraged for God class detection. While this paper employed the SMOTE technique for data balancing, other balancing methods exist, and the comparison could be extended to other design smells. Although applying those other methods may improve the results, in our experiments SMOTE did not improve God class detection. The results are not fully generalizable because only one design smell is studied, with projects developed in a single programming language, and only one balancing technique is compared with the imbalanced case. Nevertheless, these results are promising for application in real design smell detection scenarios, as mentioned above, and other measures, such as Kappa, ROC, and MCC, have been used in assessing classifier behavior. © 2021 The Authors
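The abstract notes that measures such as Kappa and MCC were used alongside accuracy. As a minimal sketch of why that matters on imbalanced data, the Python below computes accuracy, Cohen's Kappa, and the Matthews correlation coefficient (MCC) from binary confusion-matrix counts. The function names and the example counts (5 God classes among 100) are illustrative assumptions, not data or code from the paper.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all classes that were labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the actual and predicted labels.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Hypothetical imbalanced sample: 5 God classes among 100 classes.
# A trivial detector that never reports a smell: tp=0, fp=0, fn=5, tn=95.
trivial = (0, 0, 5, 95)
# A detector that finds 4 of the 5 God classes with one false alarm.
useful = (4, 1, 1, 94)

for name, counts in (("trivial", trivial), ("useful", useful)):
    print(name, accuracy(*counts), cohen_kappa(*counts), mcc(*counts))
```

In this hypothetical sample the trivial detector still reaches 0.95 accuracy while both Kappa and MCC are 0, which is one reason chance-corrected measures are often reported when the smelly class is rare.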

Place, publisher, year, edition, pages
Elsevier B.V., 2022. Vol. 143, article id 106736
Keywords [en]
Balanced data, Design smell detection, God class, Machine learning, Software quality, Balancing, Computer software selection and evaluation, Learning algorithms, Life cycle, Odors, Software design, Balanced datasets, Context design, Imbalanced dataset, Machine learning algorithms, Machine learning approaches, Performance, Software-systems
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-22375
DOI: 10.1016/j.infsof.2021.106736
ISI: 000720570800002
Scopus ID: 2-s2.0-85118508099
OAI: oai:DiVA.org:bth-22375
DiVA, id: diva2:1612698
Note

open access

Available from: 2021-11-19 Created: 2021-11-19 Last updated: 2021-12-03 Bibliographically approved

Open Access in DiVA

fulltext (2733 kB), 225 downloads
File information
File name: FULLTEXT01.pdf
File size: 2733 kB
Checksum (SHA-512): 8435898fcbf944511be492d90af03602a34a47d106c5da20aea0f9bee648d27c438dd9544b968f2bcb2852f4f70e464754be4140e6f7de3e6e21c43acb027beb
Type: fulltext
Mimetype: application/pdf


Authority records

Kebande, Victor R.
