Software Fault Prediction: Using Machine Learning Algorithms
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Background: Software fault prediction (SFP) is a critical task in software engineering, enabling early identification of faulty modules to improve software quality and reduce maintenance costs. SFP datasets, such as PROMISE, are often characterized by high-dimensional metrics and multicollinearity, posing unique challenges. This research investigates the combined effects of feature selection and parameter tuning on the performance of machine learning models for SFP.

Method: This study evaluates the interaction between feature selection methods, including Correlation-Based Feature Selection (CFS), Recursive Feature Elimination (RFE), Mutual Information (MI), and L1 Regularization, and hyperparameter tuning techniques such as Grid Search, Randomized Search, and Genetic Algorithm. Widely used machine learning algorithms, including Random Forest, Logistic Regression, and Support Vector Machines (SVM), are employed to optimize fault prediction performance. 
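
As an illustration of the multivariate selection idea behind CFS, the sketch below scores a feature subset with the standard merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), and grows the subset greedily. It is a minimal sketch under stated assumptions, not the thesis's implementation: Pearson correlation is assumed as the correlation measure, the metric values are invented toy data, and the names `pearson`, `cfs_merit`, and `greedy_cfs` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cfs_merit(features, target, subset):
    """CFS merit: rewards correlation with the class, penalizes redundancy."""
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs in the subset.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return (k * r_cf) / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, target):
    """Forward selection: add the feature that raises merit most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        score, name = max((cfs_merit(features, target, selected + [f]), f) for f in remaining)
        if score <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = score
    return selected, best


# Toy data (invented for illustration, not taken from PROMISE or the thesis).
features = {
    "wmc": [5, 12, 7, 30, 25, 3, 18, 22],
    "cbo": [2, 9, 3, 14, 11, 1, 8, 12],
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],
}
faulty = [0, 1, 0, 1, 1, 0, 1, 1]

selected, merit = greedy_cfs(features, faulty)
print(selected, round(merit, 3))
```

On this toy data the search keeps `cbo` alone: `wmc` is almost collinear with it, so adding it lowers the merit. That redundancy penalty is what separates a multivariate method like CFS from ranking each feature independently.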

Results: The combined application of CFS and Genetic Algorithm yielded the highest accuracy, achieving 88.40% with Random Forest, representing an 18% improvement over baseline models without feature selection or tuning. Feature selection reduced dimensionality and identified critical attributes such as Weighted Methods per Class (wmc) and Coupling Between Objects (cbo), while iterative parameter tuning optimized model alignment to these feature sets. Notably, the proposed methods demonstrated robustness, with minimal cross-validation variability (±1.0%), and efficiency, reducing training times in univariate methods like L1 Regularization.

Conclusions: This study concludes that integrating multivariate feature selection with iterative hyperparameter tuning significantly improves the accuracy, robustness, and computational efficiency of software fault prediction models. The findings establish a framework for optimizing fault prediction models, such as combining CFS and Genetic Algorithms for high-stakes scenarios or Randomized Search with sparse feature selection for resource-constrained environments. These findings bridge critical gaps in SFP optimization, offering a structured approach to achieve scalable and high-performing prediction models.
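
The Randomized Search option mentioned above trades exhaustiveness for a fixed evaluation budget, which is why it suits resource-constrained settings. The following is a minimal sketch of the technique, not the thesis's setup: the search space, the function name `randomized_search`, and the `toy_score` objective are all hypothetical stand-ins. In practice the score function would train a model on the selected features and return a cross-validated accuracy.

```python
import random


def randomized_search(score_fn, space, n_iter=20, seed=0):
    """Sample n_iter configurations from `space` and keep the best-scoring one."""
    rng = random.Random(seed)  # fixed seed makes the search reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space for a Random Forest-like model.
space = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [3, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5],
}


def toy_score(params):
    """Stand-in objective (NOT a real model): higher is better, maximum is 0."""
    return -(
        abs(params["n_estimators"] - 200) / 400
        + abs(params["max_depth"] - 10) / 20
        + abs(params["min_samples_leaf"] - 2) / 5
    )


best_params, best_score = randomized_search(toy_score, space, n_iter=30, seed=42)
print(best_params, best_score)
```

A grid search over the same space would need 4 * 4 * 3 = 48 evaluations; the sketch caps the cost at `n_iter` regardless of how many hyperparameters are added, at the price of possibly missing the best grid point.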

Place, publisher, year, edition, pages
2024. p. 75
Keywords [en]
Software Fault Prediction, Machine Learning, Parameter Tuning, Feature Selection, Search Optimization
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-27169
OAI: oai:DiVA.org:bth-27169
DiVA, id: diva2:1916628
Subject / course
PA2534 Master's Thesis (120 credits) in Software Engineering
Educational program
PAAPT Master of Science Programme in Software Engineering
Available from: 2024-12-02. Created: 2024-11-28. Last updated: 2025-09-30. Bibliographically approved.

Open Access in DiVA

fulltext (1744 kB), 273 downloads
File information
File name: FULLTEXT01.pdf. File size: 1744 kB. Checksum (SHA-512):
c501169202256ea83ef333dc75ef83a784ea9ed22012e81d498fa7c558c43e4d9aec1ff801ec107ce822bc52da83b24394b3e776bbf2232db58497cdb738fc7a
Type: fulltext. Mimetype: application/pdf
