A comparative analysis of ML techniques for bug report classification
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0002-5964-5554
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0001-7266-5632
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0003-0639-4234
Lund University. ORCID iD: 0000-0001-6736-9425
2025 (English)
In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 227, article id 112457
Article in journal (Refereed) Published
Abstract [en]

Several studies have evaluated various ML techniques and found promising results in classifying bug reports. However, these studies have used different evaluation designs, making it difficult to compare their results. Furthermore, they have focused primarily on accuracy and did not consider other potentially relevant factors such as generalizability, explainability, and maintenance cost. These two aspects make it difficult for practitioners and researchers to choose an appropriate ML technique for a given context. Therefore, we compare promising ML techniques against practitioners’ concerns using evaluation criteria that go beyond accuracy. Based on an existing framework for adopting ML techniques, we developed an evaluation framework for ML techniques for bug report classification. We used this framework to compare nine ML techniques on three datasets. The results enable a tradeoff analysis between various promising ML techniques. The results show that an ML technique with the highest predictive accuracy might not be the most suitable technique for some contexts. The overall approach presented in the paper supports making informed decisions when choosing ML techniques. It is not locked to the specific techniques, datasets, or factors we have selected here, and others could easily use and adapt it for additional techniques or concerns. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.

Place, publisher, year, edition, pages
Elsevier, 2025. Vol. 227, article id 112457
Keywords [en]
Software Maintenance, Issue Classification, Bug Report Classification, Natural Language Processing, BERT, RoBERTa, Large Language Models, Automated Machine Learning, AutoML, Software Analytics
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
URN: urn:nbn:se:bth-27193
DOI: 10.1016/j.jss.2025.112457
ISI: 001481117700001
Scopus ID: 2-s2.0-105003372247
OAI: oai:DiVA.org:bth-27193
DiVA, id: diva2:1917722
Part of project
GIST – Gaining actionable Insights from Software Testing, Knowledge Foundation
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Knowledge Foundation, 20220235
Available from: 2024-12-03 Created: 2024-12-03 Last updated: 2025-09-30
Bibliographically approved
In thesis
1. Software Analytics for Supporting Practitioners in Bug Management
2025 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Context: In large-scale software development, a large number of bug reports are submitted during software development and maintenance. Practitioners need the ability to analyze this abundant data to make data-driven decisions about bug management tasks.

Objective: This thesis aims to utilize software analytics (SA) to support practitioners in bug management. The objectives of this thesis are (1) to identify and structure the knowledge on the use of SA for software engineering (SE) tasks and (2) to investigate and evaluate the practical application of SA to support practitioners in managing invalid bug reports (IBRs).

Method: We conducted a tertiary review and systematic mapping study to achieve the first objective and comparative experiments and two industrial case studies to achieve the second objective. Throughout the thesis work, we relied on a technology transfer model to guide the research and facilitate the adoption of ML techniques for the early identification of IBRs at the case company.

Results: We provide a comprehensive map of various SA applications for SE tasks and a decision matrix that can assist in selecting the most appropriate ML technique for bug report classification for a given context. Our results indicate that an ML technique can identify IBRs with acceptable accuracy at an early stage in practice. Furthermore, the results of an SA-based approach indicate that it can support practitioners in devising preventive measures for IBRs.

Conclusion: Through industrial validations, this thesis provides evidence of the usefulness of SA in bug management, particularly in supporting practitioners in managing IBRs in large-scale software development.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. p. 231
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:02
Keywords
Issue management, Bug reports, Invalid bug reports, Software analytics, Machine learning, AutoML, Large language models
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-27197 (URN)
978-91-7295-494-6 (ISBN)
Public defence
2025-02-13, J1630, Campus Gräsvik, Karlskrona, 13:15 (English)
Opponent
Supervisors
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2024-12-04 Created: 2024-12-03 Last updated: 2025-09-30
Bibliographically approved

Open Access in DiVA

fulltext (1979 kB), 44 downloads
File information
File name: FULLTEXT01.pdf
File size: 1979 kB
Checksum (SHA-512): e049feb572bf0e5a1d32fa7c8d65a1ca8dd34c95b84ad03400b81f03fdbfd3c1f1d971b8feb9fe25127f11b21a7df95631f990d3477a09fef3b4d44924d8fe8c
Type: fulltext
Mimetype: application/pdf

Authority records

Laiq, Muhammad; Ali, Nauman bin; Börstler, Jürgen

Search in DiVA

By author/editor
Laiq, Muhammad; Ali, Nauman bin; Börstler, Jürgen; Engström, Emelie
Total: 44 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 286 hits