Publications (2 of 2)
Tomic, S., Alégroth, E. & Isaac, M. (2025). Evaluation of the Choice of LLM in a Multi-Agent Solution for GUI-Test Generation. In: Fasolino A.R., Panichella S., Aleti A., Mesbah A. (Ed.), 2025 IEEE Conference on Software Testing, Verification and Validation, ICST 2025: . Paper presented at 18th IEEE Conference on Software Testing, Verification and Validation, ICST 2025, Naples, March 31- April 4, 2025 (pp. 487-497). Institute of Electrical and Electronics Engineers (IEEE)
Evaluation of the Choice of LLM in a Multi-Agent Solution for GUI-Test Generation
2025 (English). In: 2025 IEEE Conference on Software Testing, Verification and Validation, ICST 2025 / [ed] Fasolino A.R., Panichella S., Aleti A., Mesbah A., Institute of Electrical and Electronics Engineers (IEEE), 2025, pp. 487-497. Conference paper, Published paper (Refereed)
Abstract [en]

Automated testing, particularly for GUI-based systems, remains a costly, labor-intensive, and error-prone process. Despite advancements in automation, manual testing still dominates industrial practice, resulting in delays, higher costs, and increased error rates. Large Language Models (LLMs) have shown great potential to automate tasks traditionally requiring human intervention, leveraging their cognitive-like abilities for test generation and evaluation. In this study, we present PathFinder, a Multi-Agent LLM (MALLM) framework that incorporates four agents responsible for (a) perception and summarization, (b) decision-making, (c) input handling and extraction, and (d) validation, which work collaboratively to automate exploratory web-based GUI testing. The goal of this study is to assess how different LLMs, applied to different agents, affect the efficacy of automated exploratory GUI testing. We evaluate PathFinder with three models, Mistral-Nemo, Gemma2, and Llama3.1, on four e-commerce websites. We thus evaluate 27 permutations of the LLMs across three agents (excluding the validation agent) to test the hypothesis that a solution with multiple agents, each using a different LLM, is more efficacious (efficient and effective) than a multi-agent solution where all agents use the same LLM. The results indicate that the choice of LLM constellation (combination of LLMs) significantly impacts efficacy, suggesting that a single LLM across agents may yield the best balance of efficacy (measured by F1-score). Hypotheses to explain this result include, but are not limited to, improved decision-making consistency and reduced task-coordination discrepancies. The contributions of this study are an architecture for MALLM-based GUI testing, empirical results on its performance, and novel insights into how LLM selection impacts the efficacy of automated testing.
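The agent pipeline and the 27-way model assignment described in the abstract can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the function names, prompts, and the `ask(model, prompt)` callable are assumptions introduced here.

```python
from itertools import product

MODELS = ["mistral-nemo", "gemma2", "llama3.1"]  # the three models evaluated
AGENTS = ["perception", "decision", "input"]     # validation agent excluded from permutation

# One model per agent -> 3^3 = 27 constellations, as in the study design.
constellations = [dict(zip(AGENTS, combo)) for combo in product(MODELS, repeat=3)]

def run_episode(constellation, ask):
    """Sketch of one exploratory step: each agent queries its assigned model.
    `ask(model, prompt)` is a hypothetical LLM call supplied by the caller."""
    summary = ask(constellation["perception"], "Summarize the current GUI state.")
    action = ask(constellation["decision"], f"Given: {summary}. Choose the next action.")
    filled = ask(constellation["input"], f"Provide input values for: {action}")
    return filled
```

Enumerating constellations this way makes the same-LLM baselines (the three combinations where all agents share one model) a natural subset of the 27 permutations under evaluation.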

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Series
IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW, ISSN 2159-4848
Keywords
AI-Assisted Software Testing, Automated Testing, Large Language Models (LLMs), MALLM, Multi-Agent Systems, Ability testing, Autonomous agents, C (programming language), Intelligent agents, Model checking, Software testing, GUI testing, Language model, Large language model, Multi agent, Multi-agent LLM, Multiagent systems (MASs), Software testings, Test generations, Automatic test pattern generation
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-28172 (URN)
10.1109/ICST62969.2025.10989038 (DOI)
001506893900043 ()
2-s2.0-105007519090 (Scopus ID)
9798331508142 (ISBN)
Conference
18th IEEE Conference on Software Testing, Verification and Validation, ICST 2025, Naples, March 31- April 4, 2025
Funder
Vinnova, 2024-00242; Knowledge Foundation, 20180010
Available from: 2025-06-23. Created: 2025-06-23. Last updated: 2025-09-30. Bibliographically approved.
Laudadio, L., Coppola, R., Torchiano, M. & Tomic, S. (2024). ScoutDroid: A Tool for Mobile Augmented Testing with Live Feedback. In: Coppola R., Ardito L., Fraser G., Leotta M. (Ed.), Gamify 2024 - Proceedings of the 3rd ACM International Workshop on Gamification in Software Development, Verification, and Validation, Co-located with: ISSTA 2024: . Paper presented at 3rd ACM International Workshop on Gamification in Software Development, Verification, and Validation, Gamify 2024, Co-located with: ISSTA 2024, Vienna, Sept 17, 2024 (pp. 34-37). Association for Computing Machinery (ACM)
ScoutDroid: A Tool for Mobile Augmented Testing with Live Feedback
2024 (English). In: Gamify 2024 - Proceedings of the 3rd ACM International Workshop on Gamification in Software Development, Verification, and Validation, Co-located with: ISSTA 2024 / [ed] Coppola R., Ardito L., Fraser G., Leotta M., Association for Computing Machinery (ACM), 2024, pp. 34-37. Conference paper, Published paper (Refereed)
Abstract [en]

In the evolving landscape of mobile applications, effective and efficient testing methods are crucial for ensuring high-quality user experiences. This paper introduces a novel end-to-end mobile testing technique designed to enhance exploratory testing by incorporating gamification strategies. We developed a plugin that integrates these techniques, aiming to make the testing process more engaging and effective. Using a live feedback system, the plugin drives testers to explore the application thoroughly, leading to the discovery of more defects and improved software quality. Preliminary evaluation suggests that this approach could not only increase tester engagement but also improve the detection rate of critical issues. This research highlights the potential of merging exploratory testing with gamification, setting the stage for more dynamic and productive mobile testing methodologies.
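The live-feedback idea can be illustrated with a minimal scoring sketch. The formula, names, and weights below are hypothetical assumptions chosen only to show how exploration coverage and defect discovery could be combined into one score shown to the tester; they are not ScoutDroid's actual mechanism.

```python
def live_feedback(explored_screens, total_screens, defects_found, defect_weight=10):
    """Combine exploration coverage and defect count into a single gamified
    score (hypothetical formula): full coverage is worth 100 points, and each
    defect found adds a fixed bonus, nudging testers toward unexplored screens."""
    coverage = len(explored_screens) / total_screens if total_screens else 0.0
    return round(100 * coverage + defect_weight * defects_found, 1)
```

Updating such a score after every interaction is what makes the feedback "live": the tester sees the number move as they explore, which is the engagement loop the paper's gamification strategy targets.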

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Gamification, GUI Testing, Mobile Applications, Software Engineering, Software Testing, Application programs, Software quality, Exploratory testing, High quality, Mobile testing, Plug-ins, Software testings, Testing method, Users' experiences
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-27016 (URN)
10.1145/3678869.3685688 (DOI)
001323635100006 ()
2-s2.0-85205765371 (Scopus ID)
9798400711138 (ISBN)
Conference
3rd ACM International Workshop on Gamification in Software Development, Verification, and Validation, Gamify 2024, Co-located with: ISSTA 2024, Vienna, Sept 17, 2024
Available from: 2024-10-18. Created: 2024-10-18. Last updated: 2025-09-30. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-4379-6614