Publications (10 of 98)
Nass, M., Alégroth, E. & Feldt, R. (2024). Improving Web Element Localization by Using a Large Language Model. Software Testing, Verification & Reliability, 34(7)
2024 (English). In: Software Testing, Verification & Reliability, ISSN 0960-0833, E-ISSN 1099-1689, Vol. 34, no. 7. Article in journal (Refereed). Published
Abstract [en]

Web-based test automation heavily relies on accurately finding web elements. Traditional methods compare attributes but do not grasp the context and meaning of elements and words. The emergence of Large Language Models (LLMs) like GPT-4, which can show human-like reasoning abilities on some tasks, offers new opportunities for software engineering and web element localization. This paper introduces and evaluates VON Similo LLM, an enhanced web element localization approach. Using an LLM, it selects the most likely web element from the top-ranked ones identified by the existing VON Similo method, aiming to approach human-like selection accuracy. An experimental study was conducted using 804 web element pairs from 48 real-world web applications. We measured the number of correctly identified elements as well as the execution times, comparing the effectiveness and efficiency of VON Similo LLM against the baseline algorithm. In addition, motivations from the LLM were recorded and analyzed for all instances where the original approach failed to find the right web element. VON Similo LLM demonstrated improved performance, reducing failed localizations from 70 to 39 (out of 804), a 44 percent reduction. Despite its slower execution time and the additional cost of using the GPT-4 model, the LLM's human-like reasoning showed promise in enhancing web element localization. LLM technology can enhance web element identification in GUI test automation, reducing false positives and potentially lowering maintenance costs. However, further research is necessary to fully understand LLMs' capabilities, limitations, and practical use in GUI testing.
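
To make the approach concrete, here is a minimal sketch of the general idea of LLM-based reranking, not the authors' implementation: the prompt wording, the candidate format, and the pick_element helper are illustrative assumptions, and the OpenAI client is shown only as one possible way to call GPT-4.

# Hedged sketch: ask an LLM to pick the most likely target among the
# top-ranked candidates from a similarity-based locator. Prompt wording,
# candidate format, and answer parsing are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def pick_element(target_desc: str, candidates: list[str]) -> int:
    """Return the index of the candidate the LLM judges most likely to be
    the sought web element in the new release."""
    listing = "\n".join(f"{i}: {c}" for i, c in enumerate(candidates))
    prompt = (
        "A web element had these attributes in the previous release:\n"
        f"{target_desc}\n\n"
        "Which of the following elements from the new release is most "
        f"likely the same element? Answer with the index only.\n{listing}"
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    # Simplistic parsing; a robust implementation would validate the reply.
    return int(reply.choices[0].message.content.strip())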

Place, publisher, year, edition, pages
John Wiley & Sons, 2024
Keywords
GUI Testing, Test Automation, Test Case Robustness, Web Element Locators, Large Language Models
National Category
Computer Systems
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-25637 (URN), 10.1002/stvr.1893 (DOI), 001290853000001 (ISI), 2-s2.0-85201296537 (Scopus ID)
Funder
Knowledge Foundation, 20180010
Available from: 2023-11-22. Created: 2023-11-22. Last updated: 2025-01-03. Bibliographically approved
Lenberg, P., Feldt, R., Gren, L., Wallgren Tengberg, L. G., Tidefors, I. & Graziotin, D. (2024). Qualitative software engineering research: Reflections and guidelines. Journal of Software: Evolution and Process, 36(6), Article ID e2607.
2024 (English). In: Journal of Software: Evolution and Process, ISSN 2047-7473, E-ISSN 2047-7481, Vol. 36, no. 6, article id e2607. Article, review/survey (Refereed). Published
Abstract [en]

Researchers are increasingly recognizing the importance of human aspects in software development. Because qualitative methods are used to explore human behavior in-depth, we believe that studies using such methods will become more common. Existing qualitative software engineering guidelines cover neither the full breadth of qualitative methods nor the knowledge of how to use them that is available in the social sciences. The purpose of this study was to extend the software engineering community's current body of knowledge regarding available qualitative methods and their quality assurance frameworks and to provide recommendations and guidelines for their use. With the support of an epistemological argument and a survey of the literature, we suggest that future research would benefit from (1) utilizing a broader set of research methods, (2) more strongly emphasizing reflexivity, and (3) employing qualitative guidelines and quality criteria. We present an overview of three qualitative methods commonly used in social sciences but rarely seen in software engineering research, namely interpretative phenomenological analysis, narrative analysis, and discourse analysis. Furthermore, we discuss the meaning of reflexivity in relation to the software engineering context and suggest means of fostering it. Our paper will help software engineering researchers select from a broader set of qualitative research methods and guide their application. © 2023 John Wiley & Sons, Ltd.

Place, publisher, year, edition, pages
John Wiley & Sons, 2024
Keywords
behavioral software engineering, COREQ, interpretive phenomenological analysis, narrative and discourse analysis, qualitative research, reflexivity, Application programs, Quality assurance, Social sciences computing, Software design, Discourse analysis, Interpretive phenomenological analysis, Narrative analysis, Phenomenological analysis, Qualitative method, Software engineering research, Behavioral research
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-25412 (URN), 10.1002/smr.2607 (DOI), 001064285800001 (ISI), 2-s2.0-85170850509 (Scopus ID)
Funder
Vinnova, 2017-04874; Swedish Defence Materiel Administration; Swedish Armed Forces
Available from: 2023-09-22. Created: 2023-09-22. Last updated: 2024-06-24. Bibliographically approved
Nass, M., Alégroth, E., Feldt, R. & Coppola, R. (2023). Robust web element identification for evolving applications by considering visual overlaps. In: Proceedings - 2023 IEEE 16th International Conference on Software Testing, Verification and Validation, ICST 2023. Paper presented at 16th IEEE International Conference on Software Testing, Verification and Validation, ICST 2023, Dublin, 16 April 2023 through 20 April 2023 (pp. 258-268). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: Proceedings - 2023 IEEE 16th International Conference on Software Testing, Verification and Validation, ICST 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 258-268. Conference paper, Published paper (Refereed)
Abstract [en]

Fragile (i.e., non-robust) test execution is a common challenge for automated GUI-based testing of web applications as they evolve. Despite recent progress, there is still room for improvement since test execution failures caused by technical limitations result in unnecessary maintenance costs that limit its effectiveness and efficiency. One of the most reported technical challenges for web-based tests concerns how to reliably locate a web element used by a test script. This paper proposes the novel concept of Visually Overlapping Nodes (VON), which reduces fragility by exploiting the phenomenon that the visual web elements observed by the user are constructed from multiple web elements in the Document Object Model (DOM) that overlap visually. We demonstrate the approach in a tool, VON Similo, which extends the state-of-the-art multi-locator approach (Similo) that also serves as the baseline for an experiment. In the experiment, a ground truth set of 1163 manually collected web element pairs, from different releases of the 40 most popular web applications on the internet, is used to compare the approaches' precision, recall, and accuracy. Our results show that VON Similo provides 94.7% accuracy in identifying a web element in a new release of the same SUT, compared to 83.8% accuracy for Similo. These results demonstrate the applicability of the visually overlapping nodes concept/tool for web element localization in evolving web applications and contribute a novel way of thinking about web element localization in future research on GUI-based testing. © 2023 IEEE.
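
To illustrate the core idea only (this is not the paper's algorithm), the sketch below treats DOM nodes whose rendered bounding boxes overlap as belonging to one visual element; the Rect type, the overlap_ratio helper, and the notion of "mostly overlapping" are assumptions introduced for this example.

# Hedged sketch: group DOM nodes into one visual element when their
# rendered bounding boxes overlap. Rectangle format is an assumption.
from typing import NamedTuple

class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float

def overlap_ratio(a: Rect, b: Rect) -> float:
    """Area of intersection divided by the smaller rectangle's area."""
    w = min(a.right, b.right) - max(a.left, b.left)
    h = min(a.bottom, b.bottom) - max(a.top, b.top)
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    smaller = min((a.right - a.left) * (a.bottom - a.top),
                  (b.right - b.left) * (b.bottom - b.top))
    return inter / smaller

# A <span> rendered inside a <button> overlaps it fully, so both nodes
# would be matched as one visual element rather than as separate DOM nodes.
button = Rect(10, 10, 110, 40)
label = Rect(20, 15, 100, 35)
assert overlap_ratio(button, label) == 1.0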

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
component, formatting, insert, style, styling
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-25063 (URN), 10.1109/ICST57152.2023.00032 (DOI), 001009201200025 (ISI), 2-s2.0-85161886412 (Scopus ID), 9781665456661 (ISBN)
Conference
16th IEEE International Conference on Software Testing, Verification and Validation, ICST 2023, Dublin, 16 April 2023 through 20 April 2023
Available from: 2023-06-30. Created: 2023-06-30. Last updated: 2023-11-22. Bibliographically approved
Nass, M., Alégroth, E., Feldt, R., Leotta, M. & Ricca, F. (2023). Similarity-based Web Element Localization for Robust Test Automation. ACM Transactions on Software Engineering and Methodology, 32(3), Article ID 75.
2023 (English). In: ACM Transactions on Software Engineering and Methodology, ISSN 1049-331X, E-ISSN 1557-7392, Vol. 32, no. 3, article id 75. Article in journal (Refereed). Published
Abstract [en]

Non-robust (fragile) test execution is a commonly reported challenge in GUI-based test automation, despite much research and several proposed solutions. A test script needs to be resilient to (minor) changes in the tested application but, at the same time, fail when detecting potential issues that require investigation. Test script fragility is a multi-faceted problem. However, one crucial challenge is how to reliably identify and locate the correct target web elements when the website evolves between releases, or otherwise to fail and report an issue. This article proposes and evaluates a novel approach called similarity-based web element localization (Similo), which leverages information from multiple web element locator parameters to identify a target element using a weighted similarity score. This experimental study compares Similo to a baseline approach for web element localization. To get an extensive empirical basis, we target 48 of the most popular websites on the Internet in our evaluation. Robustness is considered by counting the number of web elements found in a recent website version compared to how many of these existed in an older version. Results of the experiment show that Similo outperforms the baseline; it failed to locate the correct target web element in 91 out of 801 considered cases (i.e., 11%), compared to 214 failed cases (i.e., 27%) for the baseline approach. The time efficiency of Similo was also considered, where the average time to locate a web element was determined to be 4 milliseconds. However, since the cost of web interactions (e.g., a click) is typically on the order of hundreds of milliseconds, the additional computational demands of Similo can be considered negligible. This study presents evidence that quantifying the similarity between multiple attributes of web elements when trying to locate them, as in our proposed Similo approach, is beneficial. With acceptable efficiency, Similo gives significantly higher effectiveness (i.e., robustness) than the baseline web element localization approach.
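
The following sketch shows the general shape of such a weighted similarity score; the attribute set, the weights, and the exact-match comparison are illustrative assumptions rather than Similo's calibrated parameters (the paper also uses, e.g., approximate string matching, omitted here for brevity).

# Hedged sketch of similarity-based localization: score each candidate in
# the new release against the known (old) element's locator parameters and
# pick the highest-scoring one. Weights and attributes are assumptions.
WEIGHTS = {"tag": 1.5, "id": 1.0, "name": 1.0, "class": 0.5, "text": 1.5}

def similarity(old: dict, new: dict) -> float:
    """Weighted sum of matching locator parameters between two elements."""
    return sum(
        weight
        for attr, weight in WEIGHTS.items()
        if old.get(attr) and old.get(attr) == new.get(attr)
    )

def locate(old: dict, candidates: list[dict]) -> dict:
    """Return the candidate most similar to the element from the old release."""
    return max(candidates, key=lambda c: similarity(old, c))

old = {"tag": "button", "id": "submit", "text": "Send"}
candidates = [
    {"tag": "button", "id": "submit-btn", "text": "Send"},  # id renamed
    {"tag": "a", "id": "submit", "text": "Cancel"},         # id reused elsewhere
]
assert locate(old, candidates) is candidates[0]  # tag + text outweigh the reused id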

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
GUI testing, test automation, test case robustness, web element locators, XPath locators
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-25065 (URN), 10.1145/3571855 (DOI), 001002573400020 (ISI)
Funder
Knowledge Foundation, 20180010; Swedish Research Council, 2015-04913; Swedish Research Council, 2020-05272
Available from: 2023-06-30. Created: 2023-06-30. Last updated: 2023-11-22. Bibliographically approved
Fagerholm, F., Felderer, M., Fucci, D., Unterkalmsteiner, M., Marculescu, B., Martini, M., . . . Khattak, J. (2022). Cognition in Software Engineering: A Taxonomy and Survey of a Half-Century of Research. ACM Computing Surveys, 54(11)
2022 (English). In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 54, no. 11. Article in journal (Refereed). Published
Abstract [en]

Cognition plays a fundamental role in most software engineering activities. This article provides a taxonomy of cognitive concepts and a survey of the literature since the beginning of the Software Engineering discipline. The taxonomy comprises the top-level concepts of perception, attention, memory, cognitive load, reasoning, cognitive biases, knowledge, social cognition, cognitive control, and errors, together with procedures to assess them both qualitatively and quantitatively. The taxonomy provides a useful tool to filter existing studies, classify new studies, and support researchers in getting familiar with a (sub)area. In the literature survey, we systematically collected and analysed 311 scientific papers spanning five decades and classified them using the cognitive concepts from the taxonomy. Our analysis shows that the most developed areas of research correspond to four life-cycle stages: software requirements, design, construction, and maintenance. Most research is quantitative and focuses on knowledge, cognitive load, memory, and reasoning. Overall, the state of the art appears fragmented when viewed from the perspective of cognition. There is a lack of use of cognitive concepts that would represent a coherent picture of the cognitive processes active in specific tasks. Accordingly, we discuss the research gap in each cognitive concept and provide recommendations for future research.

Place, publisher, year, edition, pages
ACM Digital Library, 2022
Keywords
Cognition, cognitive concepts, psychology of programming, human factors, measurement, taxonomy
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-23177 (URN), 10.1145/3508359 (DOI), 000886929000001 (ISI)
Note

open access

Available from: 2022-06-16. Created: 2022-06-16. Last updated: 2023-06-30. Bibliographically approved
Tuzun, E., Erdogmus, H., Baldassarre, M. T., Felderer, M., Feldt, R. & Turhan, B. (2022). Ground-Truth Deficiencies in Software Engineering: When Codifying the Past Can Be Counterproductive. IEEE Software, 39(3), 85-95
2022 (English). In: IEEE Software, ISSN 0740-7459, E-ISSN 1937-4194, Vol. 39, no. 3, p. 85-95. Article in journal (Refereed). Published
Abstract [en]

In software engineering, the objective function of human decision makers might be influenced by many factors. Relying on historical data as the ground truth may give rise to systems that automate software engineering decisions by mimicking past suboptimal behavior. We describe the problem and offer some strategies. ©IEEE.

Place, publisher, year, edition, pages
IEEE Computer Society, 2022
Keywords
Decision making, Cognitive bias, Engineering decisions, Historical data, Human decisions, Mitigation strategy, Objective functions, Process decisions, Software engineering tools, Software engineering
National Category
Business Administration; Production Engineering, Human Work Science and Ergonomics
Identifiers
urn:nbn:se:bth-22971 (URN), 10.1109/MS.2021.3098670 (DOI), 000811542700012 (ISI), 2-s2.0-85111024778 (Scopus ID)
Available from: 2022-05-23. Created: 2022-05-23. Last updated: 2023-06-30. Bibliographically approved
Briand, L. & Feldt, R. (2021). JF Welcome: ICSE 2021. In: Proceedings - International Conference on Software Engineering. Paper presented at 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Virtual, Online, 22 May 2021 through 30 May 2021. IEEE Computer Society
2021 (English). In: Proceedings - International Conference on Software Engineering, IEEE Computer Society, 2021. Conference paper, Published paper (Other academic)
Place, publisher, year, edition, pages
IEEE Computer Society, 2021
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-22191 (URN), 10.1109/ICSE43902.2021.00007 (DOI), 2-s2.0-85115713675 (Scopus ID), 9780738113197 (ISBN)
Conference
43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Virtual, Online, 22 May 2021 through 30 May 2021
Note

Editorial

Available from: 2021-10-10. Created: 2021-10-10. Last updated: 2023-06-30. Bibliographically approved
Nass, M., Alégroth, E. & Feldt, R. (2020). On the Industrial Applicability of Augmented Testing: An Empirical Study. In: Proceedings - 2020 IEEE 13th International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2020. Paper presented at 13th IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2020, Porto, Portugal, 23 March 2020 through 27 March 2020 (pp. 364-371). Institute of Electrical and Electronics Engineers Inc., Article ID 9155725.
2020 (English). In: Proceedings - 2020 IEEE 13th International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2020, Institute of Electrical and Electronics Engineers Inc., 2020, p. 364-371, article id 9155725. Conference paper, Published paper (Refereed)
Abstract [en]

Testing applications with graphical user interfaces (GUIs) is an important but also a time-consuming task in practice. Tools and frameworks for GUI test automation can make test execution more efficient and lower the manual labor required for regression testing. However, the test scripts used for automated GUI-based testing still require a substantial development effort and are often reported as sensitive to change, leading to frequent and costly maintenance. The efficiency of development, maintenance, and evolution of such tests thereby depends on the readability of the scripts and the ease of use of the test tools/frameworks in which they are defined. To address these shortcomings of existing state-of-practice techniques, a novel technique referred to as Augmented Testing (AT) has been proposed. AT is defined as testing the System Under Test (SUT) through an Augmented GUI that superimposes information on top of the SUT GUI. The Augmented GUI can provide the user with hints, test data, or other support while also observing and recording the tester's interactions. For this study, a prototype tool called Scout, which adheres to the AT concept, was evaluated in an industrial empirical study. In the evaluation, quasi-experiments and questionnaire surveys were performed in two workshops with 12 practitioners from two Swedish companies (Ericsson and Inceptive). Results show that Scout can be used to create equivalent test cases faster, with statistical significance, than creating automated scripts in two popular state-of-practice tools. The study concludes that AT has cost-value benefits, applies to industrial-grade software, and overcomes several deficiencies of state-of-practice GUI testing technologies in terms of ease of use. © 2020 IEEE.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2020
Keywords
Augmented Testing, Industrial Case Study, System Testing, Test Automation, Automation, Graphical user interfaces, Surveys, Testing, Verification, Automated scripts, Empirical studies, Graphical user interfaces (GUI), Questionnaire surveys, Regression testing, Statistical significance, System under test, Time-consuming tasks, Software testing
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-20530 (URN), 10.1109/ICSTW50294.2020.00065 (DOI), 000620795100048 (ISI), 9781728110752 (ISBN)
Conference
13th IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2020, Porto, Portugal, 23 March 2020 through 27 March 2020
Available from: 2020-10-09. Created: 2020-10-09. Last updated: 2023-11-22. Bibliographically approved
de Oliveira Neto, F. G., Torkar, R., Feldt, R., Gren, L., Furia, C. A. & Huang, Z. (2019). Evolution of statistical analysis in empirical software engineering research: Current state and steps forward. Journal of Systems and Software, 156, 246-267
2019 (English). In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 156, p. 246-267. Article in journal (Refereed). Published
Abstract [en]

Software engineering research is evolving and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine if, and to what degree, empirical evidence supports their hypotheses. To investigate the practices and trends of statistical analysis in empirical software engineering (ESE), this paper presents a review of a large pool of papers from top-ranked software engineering journals. First, we manually reviewed 161 papers; in the second phase of our method, we conducted a more extensive semi-automatic classification of 5196 papers spanning the years 2001–2015. Results from both review steps were used to: i) identify and analyse the predominant practices in ESE (e.g., using t-test or ANOVA), as well as relevant trends in the usage of specific statistical methods (e.g., nonparametric tests and effect size measures), and ii) develop a conceptual model for a statistical analysis workflow with suggestions on how to apply different statistical methods as well as guidelines to avoid pitfalls. Lastly, we confirm existing claims that current ESE practices lack a standard for reporting the practical significance of results. We illustrate how practical significance can be discussed in terms of both the statistical analysis and the practitioner's context. © 2019 Elsevier Inc.
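
As a concrete illustration of the workflow the paper argues for, pairing a significance test with an effect size so that practical significance can be discussed, here is a short sketch; the data are fabricated, and the pairing of Mann-Whitney U with Cliff's delta is one reasonable nonparametric choice, not a prescription taken from the paper.

# Hedged sketch: report an effect size alongside a significance test so
# practical significance can be discussed. Data are fabricated.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, 30)  # e.g., scores for method A (made up)
b = rng.normal(11.0, 2.0, 30)  # e.g., scores for method B (made up)

stat, p = mannwhitneyu(a, b)   # nonparametric test of location difference

# Cliff's delta: a nonparametric effect size in [-1, 1]; values near 0
# mean little practical difference even if p is small.
greater = sum(x > y for x in a for y in b)
lesser = sum(x < y for x in a for y in b)
delta = (greater - lesser) / (len(a) * len(b))
print(f"p = {p:.3f}, Cliff's delta = {delta:.2f}")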

Place, publisher, year, edition, pages
Elsevier Inc., 2019
Keywords
Empirical software engineering, Practical significance, Semi-automated literature review, Statistical methods, Analysis of variance (ANOVA), Automation, Software testing, Statistics, Conceptual model, Empirical data, Literature reviews, Non-parametric test, Semi-automatics, Software engineering journals, Engineering research
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-18603 (URN), 10.1016/j.jss.2019.07.002 (DOI), 000483658000016 (ISI), 2-s2.0-85068745690 (Scopus ID)
Note

open access

Available from: 2019-09-10. Created: 2019-09-10. Last updated: 2023-06-30. Bibliographically approved
Kim, J., Feldt, R. & Yoo, S. (2019). Guiding Deep Learning System Testing Using Surprise Adequacy. In: International Conference on Software Engineering. Paper presented at 41st IEEE/ACM International Conference on Software Engineering (ICSE), May 25-31, 2019, Montreal, Canada (pp. 1039-1049). IEEE
2019 (English). In: International Conference on Software Engineering, IEEE, 2019, p. 1039-1049. Conference paper, Published paper (Refereed)
Abstract [en]

Deep Learning (DL) systems are rapidly being adopted in safety- and security-critical domains, urgently calling for ways to test their correctness and robustness. Testing of DL systems has traditionally relied on manual collection and labelling of data. Recently, a number of coverage criteria based on neuron activation values have been proposed. These criteria essentially count the number of neurons whose activation during the execution of a DL system satisfied certain properties, such as being above predefined thresholds. However, existing coverage criteria are not sufficiently fine-grained to capture subtle behaviours exhibited by DL systems. Moreover, evaluations have focused on showing correlation between adversarial examples and proposed criteria rather than evaluating and guiding their use for actual testing of DL systems. We propose a novel test adequacy criterion for testing of DL systems, called Surprise Adequacy for Deep Learning Systems (SADL), which is based on the behaviour of DL systems with respect to their training data. We measure the surprise of an input as the difference in the DL system's behaviour between the input and the training data (i.e., what was learnt during training), and subsequently develop this into an adequacy criterion: a good test input should be sufficiently, but not overly, surprising compared to the training data. Empirical evaluation using a range of DL systems, from simple image classifiers to autonomous driving car platforms, shows that systematic sampling of inputs based on their surprise can improve the classification accuracy of DL systems against adversarial examples by up to 77.5% via retraining.
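
As a toy illustration of the surprise idea (a simplification, not SADL's exact formulation), the sketch below fits a kernel density to mocked activation traces from "training" and scores new inputs by negative log-density; in a real setting, the traces would be hidden-layer activations of the DL model under test.

# Hedged sketch of likelihood-based surprise: an input is surprising if
# its activation trace has low density under the training distribution.
# Activations are mocked with random vectors for illustration.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
train_activations = rng.normal(0.0, 1.0, size=(500, 3))  # mocked traces

kde = gaussian_kde(train_activations.T)  # density over training behaviour

def surprise(activation: np.ndarray) -> float:
    """Higher value = less like anything seen during training."""
    density = kde(activation.reshape(3, 1))[0]
    return -np.log(max(density, 1e-12))

print(surprise(np.zeros(3)))      # typical input: low surprise
print(surprise(np.full(3, 6.0)))  # far from training data: high surprise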

Place, publisher, year, edition, pages
IEEE, 2019
Series
International Conference on Software Engineering, ISSN 0270-5257
Keywords
Test Adequacy, Deep Learning Systems
National Category
Software Engineering Other Computer and Information Science
Identifiers
urn:nbn:se:bth-20385 (URN), 10.1109/ICSE.2019.00108 (DOI), 000560373200090 (ISI), 978-1-7281-0869-8 (ISBN)
Conference
41st IEEE/ACM International Conference on Software Engineering (ICSE), MAY 25-31, 2019, Montreal, CANADA
Available from: 2020-09-08. Created: 2020-09-08. Last updated: 2023-06-30. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-5179-4205