Project

Project type/Form of grant
Title [en]
GIST – Gaining actionable Insights from Software Testing
Abstract [en]
In software development, data and visual analytics have been proposed for interpreting large volumes of multidimensional data. In software testing, however, this approach has not been investigated well. Moreover, the cost of existing approaches is too high for practical use due to the following limitations: (a) tight coupling between the analyses and the underlying data-collection mechanisms, and (b) the need to interact with and integrate various data sources to collect all the necessary data. In GIST, we develop and evaluate tools to address these limitations.

Using design science, we identify and consolidate the information needs of key stakeholders and develop analytical solutions to meet these needs. We then devise an information model to decouple the analytical solutions from the data sources and develop an efficient and reusable data-collection mechanism.

GIST will provide companies with tools to support critical operational and strategic decisions. Likewise, GIST will benefit research, since the validated tools can be generalized and adapted to other areas. The layered, modular architecture of GIST supports reuse, improves data accessibility, and increases the pace of further advances in software analytics by enabling independent contributions at different levels of the GIST architecture.

Our team is an ideal combination for carrying out GIST successfully. Our industry partners from Axis, Ericsson, and TestScouts have long experience in software testing. Their domain knowledge and familiarity with research collaborations benefit the project. On the academic side, we have extensive experience with industrial collaborations, software analytics, empirical research, and software testing. We also have relevant experience in data warehousing and software development.

While GIST focuses on software testing, the solutions can also be generalized to other knowledge areas within software engineering. The project will thereby create synergies with ongoing local and national research projects. Experiences from the project will also be continuously integrated into our teaching and lead to degree theses on related topics.
Publications (10 of 11)
Laiq, M., Ali, N. b., Börstler, J. & Engström, E. (2025). A comparative analysis of ML techniques for bug report classification. Journal of Systems and Software, 227, Article ID 112457.
A comparative analysis of ML techniques for bug report classification
2025 (English) In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 227, article id 112457. Article in journal (Refereed). Published
Abstract [en]

Several studies have evaluated various ML techniques and found promising results in classifying bug reports. However, these studies have used different evaluation designs, making it difficult to compare their results. Furthermore, they have focused primarily on accuracy and did not consider other potentially relevant factors such as generalizability, explainability, and maintenance cost. These two aspects make it difficult for practitioners and researchers to choose an appropriate ML technique for a given context. Therefore, we compare promising ML techniques against practitioners’ concerns using evaluation criteria that go beyond accuracy. Based on an existing framework for adopting ML techniques, we developed an evaluation framework for ML techniques for bug report classification. We used this framework to compare nine ML techniques on three datasets. The results enable a tradeoff analysis between various promising ML techniques. The results show that an ML technique with the highest predictive accuracy might not be the most suitable technique for some contexts. The overall approach presented in the paper supports making informed decisions when choosing ML techniques. It is not locked to the specific techniques, datasets, or factors we have selected here, and others could easily use and adapt it for additional techniques or concerns. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
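The tradeoff analysis described in the abstract can be sketched as a weighted scoring over criteria beyond accuracy. All technique names, criterion scores, and weights below are invented for illustration and are not taken from the paper:

```python
# Hypothetical tradeoff analysis: score each ML technique on several
# criteria (not accuracy alone) with context-specific weights.
techniques = {
    "BERT-like": {"accuracy": 0.90, "explainability": 0.3, "maintenance": 0.4},
    "AutoML":    {"accuracy": 0.86, "explainability": 0.5, "maintenance": 0.7},
    "Logistic":  {"accuracy": 0.82, "explainability": 0.9, "maintenance": 0.9},
}
# Weights reflect one hypothetical context; another context would weigh differently.
weights = {"accuracy": 0.4, "explainability": 0.3, "maintenance": 0.3}

def score(profile):
    """Weighted sum of a technique's criterion scores."""
    return sum(weights[c] * v for c, v in profile.items())

best = max(techniques, key=lambda t: score(techniques[t]))
# With these invented numbers, the technique with the highest accuracy
# ("BERT-like") is not the one selected, mirroring the paper's main point.
```

The point of the sketch is only that the ranking flips once other concerns are weighted in; the framework in the paper compares real techniques on real datasets.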

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Software Maintenance, Issue Classification, Bug Report Classification, Natural Language Processing, BERT, RoBERTa, Large Language Models, Automated Machine Learning, AutoML, Software Analytics
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-27193 (URN) 10.1016/j.jss.2025.112457 (DOI) 001481117700001 () 2-s2.0-105003372247 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235
Available from: 2024-12-03 Created: 2024-12-03 Last updated: 2025-05-23. Bibliographically approved
Tran, H. K., Ali, N. b., Unterkalmsteiner, M. & Börstler, J. (2025). A proposal and assessment of an improved heuristic for the Eager Test smell detection. Journal of Systems and Software, 226, Article ID 112438.
A proposal and assessment of an improved heuristic for the Eager Test smell detection
2025 (English) In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 226, article id 112438. Article in journal (Refereed). Published
Abstract [en]

Context: The evidence for the prevalence of test smells at the unit testing level has relied on the accuracy of detection tools, which have seen intense research in the last two decades. The Eager Test smell, one of the most prevalent, is often identified using simplified detection rules that practitioners find inadequate.

Objective: We aim to improve the rules for detecting the Eager Test smell.

Method: We reviewed the literature on test smells to analyze the definitions and detection rules of the Eager Test smell. We proposed a novel, unambiguous definition of the test smell and a heuristic to address the limitations of the existing rules. We evaluated our heuristic against existing detection rules by manually applying it to 300 unit test cases in Java.

Results: Our review identified 56 relevant studies. We found that inadequate interpretations of original definitions of the Eager Test smell led to imprecise detection rules, resulting in a high level of disagreement in detection outcomes. Also, our heuristic detected patterns of eager and non-eager tests that existing rules missed.

Conclusion: Our heuristic captures the essence of the Eager Test smell more precisely; hence, it may address practitioners’ concerns regarding the adequacy of existing detection rules.
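As a hypothetical illustration of the kind of simplified detection rule the paper finds inadequate: a common rule flags a unit test as "eager" if it calls more than one distinct production method. Real tools analyze Java ASTs; the function name and the call lists below are invented:

```python
# Toy rendering of a simplified Eager Test detection rule: a test is flagged
# "eager" if it invokes more than one distinct production method.
def naive_eager_test(calls_to_production_methods):
    return len(set(calls_to_production_methods)) > 1

# A test exercising one method several times is (correctly) not flagged...
assert not naive_eager_test(["push", "push", "push"])
# ...but a test that calls a getter merely to verify the resulting state is
# flagged too, illustrating the imprecision that motivates a better heuristic.
assert naive_eager_test(["push", "size"])
```

The paper's contribution is precisely a more discriminating heuristic than this call-count rule, evaluated manually on 300 Java unit test cases.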

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Software testing, Test case quality, Test suite quality, Quality assurance, Test smells, Unit testing, Eager Test, Java, JUnit
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-27675 (URN) 10.1016/j.jss.2025.112438 (DOI) 001464187400001 () 2-s2.0-105001808870 (Scopus ID)
Available from: 2025-03-31 Created: 2025-03-31 Last updated: 2025-04-25. Bibliographically approved
Thode, L., Iftikhar, U. & Mendez, D. (2025). Exploring the use of LLMs for the selection phase in systematic literature studies. Information and Software Technology, 184, Article ID 107757.
Exploring the use of LLMs for the selection phase in systematic literature studies
2025 (English) In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 184, article id 107757. Article in journal (Refereed). Published
Abstract [en]

Context: Systematic literature studies, such as secondary studies, are crucial to aggregate evidence. An essential part of these studies is the selection phase of relevant studies. This, however, is time-consuming, resource-intensive, and error-prone as it highly depends on manual labor and domain expertise. The increasing popularity of Large Language Models (LLMs) raises the question to what extent these manual study selection tasks could be supported in an automated manner.

Objectives: In this manuscript, we report on our effort to explore and evaluate the use of state-of-the-art LLMs to automate the selection phase in systematic literature studies.

Method: We evaluated LLMs for the selection phase using two published systematic literature studies in software engineering as ground truth. Three prompts were designed and applied across five LLMs to the studies’ titles and abstracts based on their inclusion and exclusion criteria. Additionally, we analyzed combining two LLMs to replicate a practical selection phase. We analyzed recall and precision and reflected upon the accuracy of the LLMs, and whether the ground truth studies were conducted by early career scholars or by more advanced ones.

Results: Our results show a high average recall of up to 98% combined with a precision of 27% in a single-LLM approach, and an average recall of 99% with a precision of 27% in a two-model approach replicating a two-reviewer procedure. Furthermore, the Llama 2 models showed the highest average recall (98%) across all prompt templates and datasets, while GPT4-turbo had the highest average precision (72%).

Conclusions: Our results demonstrate how LLMs could support the selection phase in the future. We recommend a two-LLM approach to achieve higher recall. However, further studies using other models and prompts on more datasets are required to strengthen the confidence in the presented approach.
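The recall gain from the two-model approach follows from taking the union of the two models' selections: a relevant study is kept whenever either model includes it, so union recall can never fall below a single model's. The sets and counts below are invented for illustration and are not the paper's data:

```python
# Toy sketch of single-LLM vs. two-LLM (union) study selection.

def recall_precision(selected, relevant):
    """Recall and precision of a selected set against a ground-truth set."""
    tp = len(selected & relevant)
    recall = tp / len(relevant)
    precision = tp / len(selected) if selected else 0.0
    return recall, precision

# Hypothetical ground truth: studies 1-10 are relevant among 100 candidates.
relevant = set(range(1, 11))
llm_a = set(range(1, 9)) | set(range(20, 40))   # misses studies 9 and 10
llm_b = set(range(3, 11)) | set(range(50, 70))  # misses studies 1 and 2

union = llm_a | llm_b  # keep a study if either model selects it
r_a, p_a = recall_precision(llm_a, relevant)
r_u, p_u = recall_precision(union, relevant)
assert r_u >= r_a  # union recall is never lower than a single model's
```

Note the cost visible even in this toy case: the union also keeps both models' false positives, so precision tends to drop as recall rises, matching the high-recall/low-precision pattern reported above.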

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Automation, Large language models, Systematic literature studies
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-27884 (URN) 10.1016/j.infsof.2025.107757 (DOI) 001491965200001 () 2-s2.0-105004904751 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20180010; Knowledge Foundation, 20220235
Available from: 2025-05-23 Created: 2025-05-23 Last updated: 2025-06-02. Bibliographically approved
Tran, H. K., Ali, N. b., Unterkalmsteiner, M., Börstler, J. & Chatzipetrou, P. (2025). Quality attributes of test cases and test suites - importance & challenges from practitioners' perspectives. Software quality journal, 33(1), Article ID 9.
Quality attributes of test cases and test suites - importance & challenges from practitioners' perspectives
2025 (English) In: Software quality journal, ISSN 0963-9314, E-ISSN 1573-1367, Vol. 33, no 1, article id 9. Article in journal (Refereed). Published
Abstract [en]

The quality of the test suites and the constituent test cases significantly impacts confidence in software testing. While research has identified several quality attributes of test cases and test suites, there is a need for a better understanding of their relative importance in practice. We investigate practitioners' perceptions regarding the relative importance of quality attributes of test cases and test suites and the challenges that they face in ensuring the perceived important quality attributes. To capture the practitioners' perceptions, we conducted an industrial survey using a questionnaire based on the quality attributes identified in an extensive literature review. We used a sampling strategy that leverages LinkedIn to draw a large and heterogeneous sample of professionals with experience in software testing. We collected 354 responses from practitioners with a wide range of experience (from less than one year to 42 years of experience). We found that the majority of practitioners rated Fault Detection, Usability, Maintainability, Reliability, and Coverage to be the most important quality attributes. Resource Efficiency, Reusability, and Simplicity received the most divergent opinions, which, according to our analysis, depend on the software-testing contexts. Also, we identified common challenges that apply to the important attributes, namely inadequate definition, lack of useful metrics, lack of an established review process, and lack of external support. The findings point out where practitioners actually need further support with respect to achieving high-quality test cases and test suites under different software testing contexts. Hence, the findings can serve as a guideline for academic researchers when looking for research directions on the topic. Furthermore, the findings can be used to encourage companies to provide more support to practitioners to achieve high-quality test cases and test suites.

Place, publisher, year, edition, pages
Springer, 2025
Keywords
Software testing, Test case quality, Test suite quality, Quality assurance
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-27395 (URN) 10.1007/s11219-024-09698-w (DOI) 001396622900001 () 2-s2.0-85217646661 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235; Knowledge Foundation, 20180010
Available from: 2025-01-24 Created: 2025-01-24 Last updated: 2025-04-03. Bibliographically approved
Iftikhar, U., Börstler, J., Ali, N. b. & Kopp, O. (2025). Supporting the identification of prevalent quality issues in code changes by analyzing reviewers’ feedback. Software quality journal, 33(2), Article ID 22.
Supporting the identification of prevalent quality issues in code changes by analyzing reviewers’ feedback
2025 (English) In: Software quality journal, ISSN 0963-9314, E-ISSN 1573-1367, Vol. 33, no 2, article id 22. Article in journal (Refereed). Published
Abstract [en]

Context: Code reviewers provide valuable feedback during the code review. Identifying common issues described in the reviewers’ feedback can provide input for devising context-specific software development improvements. However, the use of reviewer feedback for this purpose is currently less explored.

Objective: In this study, we assess how automation can derive more interpretable and informative themes in reviewers’ feedback and whether these themes help to identify recurring quality-related issues in code changes.

Method: We conducted a participatory case study using the JabRef system to analyze reviewers’ feedback on merged and abandoned code changes. We used two promising topic modeling methods (GSDMM and BERTopic) to identify themes in 5,560 code review comments. The resulting themes were analyzed and named by a domain expert from JabRef.

Results: The domain expert considered the identified themes from the two topic models to represent quality-related issues. Different quality issues are pointed out in code reviews for merged and abandoned code changes. While BERTopic provides higher objective coherence, the domain expert considered the themes from short-text topic modeling (GSDMM) more informative and easier to interpret than those from BERTopic.

Conclusions: The identified prevalent code quality issues primarily concern maintainability. The analysis of code review comments can enhance current practices at JabRef by improving the guidelines for new developers and focusing discussions in the developer forums. The choice of topic model impacts the interpretability of the generated themes; a higher coherence (based on objective measures) of the generated topics did not lead to improved interpretability for the domain expert.

Place, publisher, year, edition, pages
Springer, 2025
Keywords
Modern code review, Natural language processing, Open-source systems, Software quality improvement, Computer software selection and evaluation, Open source software, Software design, Code changes, Code review, Domain experts, Language processing, Natural languages, Open source system, Software quality improvements, Topic Modeling, Software quality
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-27789 (URN) 10.1007/s11219-025-09720-9 (DOI) 001473057800001 () 2-s2.0-105003288015 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235
Available from: 2025-05-02 Created: 2025-05-02 Last updated: 2025-05-02. Bibliographically approved
Tran, H. K. (2025). Towards Reliable Eager Test Detection: Practitioner Validation and a Tool Prototype. In: : . Paper presented at 8th Workshop on Validation, Analysis and Evolution of Software Tests, Montréal, Canada, March 04, 2025.
Towards Reliable Eager Test Detection: Practitioner Validation and a Tool Prototype
2025 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Context: Existing tools for detecting eager tests produce many false positives, rendering them unreliable for practitioners. To address this, our previous work introduced a novel definition of the Eager Test smell and a heuristic for more effective identification. Comparing the heuristic’s results with existing detection rules revealed eight test patterns where the rules misclassified the presence or absence of eager tests.

Objective: We aim to gather practitioners’ feedback on our heuristic’s assessment of these eight test patterns and operationalize the heuristic in a tool we named EagerID.

Method: We conducted a survey to collect practitioners’ feedback on the eight identified test patterns and developed EagerID to detect eager tests in Java unit test cases using JUnit. We also preliminarily evaluated EagerID on 300 test cases, which were manually analyzed in our previous study.

Results: Our survey received 23 responses from practitioners with a wide range of experience. We found that most practitioners agreed with the assessment of our heuristic. Furthermore, the preliminary evaluation of EagerID returned high precision (100%), recall (91.76%), and F-Score (95.70%).
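The reported F-Score is the harmonic mean of precision and recall, so it can be checked directly from the two numbers above:

```python
# F-Score (F1) as the harmonic mean of precision and recall.
def f_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

# EagerID's reported precision (100%) and recall (91.76%):
f1 = f_score(1.0, 0.9176)
# f1 is approximately 0.9570, matching the reported F-Score of 95.70%.
```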

Conclusion: Our survey findings highlight the practical relevance of the heuristic. The preliminary evaluation of the EagerID tool confirmed the heuristic’s potential for automation. These findings suggest that the heuristic provides a solid foundation for both manual and automated detection.

Keywords
Software testing, Test case quality, Test suite quality, Quality assurance, Test smells, Unit testing, Eager Test, Detection tool, Java, JUnit
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-27674 (URN)
Conference
8th Workshop on Validation, Analysis and Evolution of Software Tests, Montréal, Canada, March 04, 2025
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235; Knowledge Foundation, 20180010
Available from: 2025-03-31 Created: 2025-03-31 Last updated: 2025-04-04. Bibliographically approved
Börstler, J., Ali, N. b., Petersen, K. & Engström, E. (2024). Acceptance behavior theories and models in software engineering — A mapping study. Information and Software Technology, 172, Article ID 107469.
Acceptance behavior theories and models in software engineering — A mapping study
2024 (English) In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 172, article id 107469. Article in journal (Refereed). Published
Abstract [en]

Context: The adoption or acceptance of new technologies or ways of working in software development activities is a recurrent topic in the software engineering literature. The topic has, therefore, been empirically investigated extensively. It is, however, unclear which theoretical frames of reference are used in this research to explain acceptance behaviors. Objective: In this study, we explore how major theories and models of acceptance behavior have been used in the software engineering literature to empirically investigate acceptance behavior. Method: We conduct a systematic mapping study of empirical studies using acceptance behavior theories in software engineering. Results: We identified 47 primary studies covering 56 theory uses. The theories were categorized into six groups. Technology acceptance models (TAM and its extensions) were used in 29 of the 47 primary studies, innovation theories in 10, and the theories of planned behavior/reasoned action (TPB/TRA) in six. All other theories were used in at most two of the primary studies. The usage and operationalization of the theories were, in many cases, inconsistent with the underlying theories. Furthermore, we identified 77 constructs used by these studies, of which many lack clear definitions. Conclusions: Our results show that software engineering researchers are aware of some of the leading theories and models of acceptance behavior, which indicates an attempt to have more theoretical foundations. However, we identified issues related to theory usage that make it difficult to aggregate and synthesize results across studies. We propose mitigation actions that encourage the consistent use of theories and emphasize the measurement of key constructs.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Acceptance behavior, Technology adoption, Theory use in software engineering, TAM, TPB, TRA, Fitness, Innovation diffusion
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-26143 (URN) 10.1016/j.infsof.2024.107469 (DOI) 001233663200001 () 2-s2.0-85190986067 (Scopus ID)
Projects
ELLIIT
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235
Available from: 2024-04-24 Created: 2024-04-24 Last updated: 2024-06-18. Bibliographically approved
Rico, S., Ali, N. b., Engström, E. & Höst, M. (2024). Experiences from conducting rapid reviews in collaboration with practitioners — Two industrial cases. Information and Software Technology, 167, Article ID 107364.
Experiences from conducting rapid reviews in collaboration with practitioners — Two industrial cases
2024 (English) In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 167, article id 107364. Article in journal (Refereed). Published
Abstract [en]

Context: Evidence-based software engineering (EBSE) aims to improve research utilization in practice. It relies on systematic methods to identify, appraise, and synthesize existing research findings to answer questions of interest for practice. However, the lack of practitioners’ involvement in these studies’ design, execution, and reporting indicates a lack of appreciation for the need for knowledge exchange between researchers and practitioners. The resultant systematic literature studies often lack relevance for practice. Objective: This paper explores the use of Rapid Reviews (RRs) in fostering knowledge exchange between academia and industry. Through the lens of two case studies, we delve into the practical application and experience of conducting RRs. Methods: We analyzed the conduct of two rapid reviews by two different groups of researchers and practitioners. We collected data through interviews and through the documents produced during the review (such as review protocols, search results, and presentations). The interviews were analyzed using thematic analysis. Results: We report how the two groups of researchers and practitioners performed the rapid reviews. We observed some benefits, like promoting dialogue and paving the way for future collaborations. We also found that practitioners entrusted the researchers to develop and follow a rigorous approach and were more interested in the applicability of the findings in their context. The problems investigated in these two cases were relevant but not the most immediate ones. Therefore, rapidness was not a priority for the practitioners. Conclusion: The study illustrates that rapid reviews can support researcher-practitioner communication and industry-academia collaboration. Furthermore, the recommendations based on the experiences from the two cases complement the detailed guidelines researchers and practitioners may follow to increase interaction and knowledge exchange.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Industry-academia collaboration, Literature reviews, Rapid reviews, Research relevance, Systematic review, Industrial research, Software engineering, Evidence Based Software Engineering, Knowledge exchange, Literature studies, Rapid review, Study design, Systematic method, Knowledge management
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-25786 (URN) 10.1016/j.infsof.2023.107364 (DOI) 001128629400001 () 2-s2.0-85178453626 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235
Available from: 2023-12-15 Created: 2023-12-15 Last updated: 2024-02-22. Bibliographically approved
Unterkalmsteiner, M., Badampudi, D., Britto, R. & Ali, N. b. (2024). Help Me to Understand this Commit! - A Vision for Contextualized Code Reviews. In: Proceedings - 2024 1st IDE Workshop, IDE 2024: . Paper presented at 1st Integrated Development Environments Workshop, IDE 2024, Lisbon, April 20 2024 (pp. 18-23). Association for Computing Machinery (ACM)
Help Me to Understand this Commit! - A Vision for Contextualized Code Reviews
2024 (English) In: Proceedings - 2024 1st IDE Workshop, IDE 2024, Association for Computing Machinery (ACM), 2024, p. 18-23. Conference paper, Published paper (Refereed)
Abstract [en]

Background: Modern Code Review (MCR) is a key component for delivering high-quality software and sharing knowledge among developers. Effective reviews require an in-depth understanding of the code and demand that reviewers contextualize the change from different perspectives.

Aim: While there is a plethora of research on solutions that support developers to understand changed code, we have observed that many provide only narrow, specialized insights and very few aggregate information in a meaningful manner. Therefore, we aim to provide a vision of improving code understanding in MCR.

Method: We classified 53 research papers suggesting proposals to improve MCR code understanding. We use this classification, the needs expressed by code reviewers from previous research, and the information we have not found in the literature for extrapolation.

Results: We identified four major types of support systems and suggest an environment for contextualized code reviews. Furthermore, we illustrate with a set of scenarios how such an environment would improve the effectiveness of code reviews.

Conclusions: Current research focuses mostly on providing narrow support for developers. We outline a vision for how MCR can be improved by using context and reducing the cognitive load on developers. We hope our vision can foster future advancements in development environments. 

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
code understanding, decision-making, modern code reviews, support systems, Reviews, Classifieds, Code review, Contextualize, Decisions makings, High-quality software, In-depth understanding, Modern code review, Sharing knowledge, Decision making
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-26892 (URN) 10.1145/3643796.3648447 (DOI) 001297920700005 () 2-s2.0-85202436597 (Scopus ID) 9798400705809 (ISBN)
Conference
1st Integrated Development Environments Workshop, IDE 2024, Lisbon, April 20 2024
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235; Knowledge Foundation, 20180010
Available from: 2024-09-10 Created: 2024-09-10 Last updated: 2024-10-04. Bibliographically approved
Laiq, M., Ali, N. b., Börstler, J. & Engström, E. (2024). Industrial adoption of machine learning techniques for early identification of invalid bug reports. Empirical Software Engineering, 29(5), Article ID 130.
Industrial adoption of machine learning techniques for early identification of invalid bug reports
2024 (English) In: Empirical Software Engineering, ISSN 1382-3256, E-ISSN 1573-7616, Vol. 29, no 5, article id 130. Article in journal (Refereed). Published
Abstract [en]

Despite the accuracy of machine learning (ML) techniques in predicting invalid bug reports, as shown in earlier research, and the importance of early identification of invalid bug reports in software maintenance, the adoption of ML techniques for this task in industrial practice is yet to be investigated. In this study, we used a technology transfer model to guide the adoption of an ML technique at a company for the early identification of invalid bug reports. In the process, we also identify necessary conditions for adopting such techniques in practice. We followed a case study research approach with various design and analysis iterations for technology transfer activities. We collected data from bug repositories, through focus groups, a questionnaire, and a presentation and feedback session with an expert. As expected, we found that an ML technique can identify invalid bug reports with acceptable accuracy at an early stage. However, the technique’s accuracy drops over time in its operational use due to changes in the product, the used technologies, or the development organization. Such changes may require retraining the ML model. During validation, practitioners highlighted the need to understand the ML technique’s predictions to trust the predictions. We found that a visual (using a state-of-the-art ML interpretation framework) and descriptive explanation of the prediction increases the trustability of the technique compared to just presenting the results of the validity predictions. We conclude that trustability, integration with the existing toolchain, and maintaining the techniques’ accuracy over time are critical for increasing the likelihood of adoption.
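The retraining trigger implied by the abstract (monitor the model's accuracy on recent predictions and retrain when it degrades) can be sketched as follows. The class name, window size, and threshold are invented for illustration; the paper does not prescribe this design:

```python
from collections import deque

class DriftMonitor:
    """Flags when a classifier's recent accuracy suggests concept drift."""

    def __init__(self, window=100, threshold=0.75):
        self.outcomes = deque(maxlen=window)  # 1 = correct prediction, 0 = wrong
        self.threshold = threshold

    def record(self, correct):
        """Record whether the latest prediction was later confirmed correct."""
        self.outcomes.append(1 if correct else 0)

    def needs_retraining(self):
        """True once the window is full and accuracy falls below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return sum(self.outcomes) / len(self.outcomes) < self.threshold
```

The sliding window matters here: it weights recent bug reports only, so accuracy lost to product or organizational changes surfaces quickly instead of being averaged away over the model's lifetime.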

Place, publisher, year, edition, pages
Springer, 2024
Keywords
Concept drift, Defect classification, Invalid bug reports, Machine learning, Software maintenance, Software quality, Computer software maintenance, Computer software selection and evaluation, Industrial research, Technology transfer, Bug reports, Concept drifts, Industrial adoption, Industrial practices, Invalid bug report, Machine learning techniques, Machine-learning, Transfer models, Forecasting
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-26802 (URN) 10.1007/s10664-024-10502-3 (DOI) 001283245300001 () 2-s2.0-85200034314 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knowledge Foundation, 20220235
Available from: 2024-08-14 Created: 2024-08-14 Last updated: 2024-12-03. Bibliographically approved
Principal Investigator: Ali, Nauman bin
Coordinating organisation
Blekinge Institute of Technology
Funder
Period
2023-10-01 - 2027-10-31
National Category
Software Engineering
Identifiers
DiVA, id: project:7239; Project, id: 20220235
