Taxonomic Trace Links in Requirements Engineering
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0001-8142-9631
2025 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Background: Software engineering is a knowledge-intensive activity that requires engineers to manage information to collaborate efficiently and effectively. Within Software Engineering, the Requirements Engineering process bridges the knowledge gap between the customer and the development team by eliciting, managing, and communicating product requirements. The traceability of these requirements supports developers in producing higher-quality software that aligns with customer needs. In addition, traceability supports other activities, such as change impact analysis, software quality assurance, and requirements-based verification.

Problem: Despite decades of research on traceability, practical challenges still hinder the adoption of traceability in practice. This signals a need for new ways of practicing traceability that fit real-world needs. 

Goal: Building on previous work, this thesis instantiates, develops, and empirically evaluates Taxonomic Trace Links, a new way to trace requirements to various software artifacts through domain knowledge captured in a taxonomy. 

Method: The studies included in this thesis follow mixed research methods: case studies, systematic mapping studies, a validation study, controlled experiments, and focus groups.

Results: The current state of practice in customer-supplier communication shows persistent challenges that we mapped to solutions in the literature. Our literature study shows that traceability through domain-specific taxonomies has not been empirically evaluated. Our development and evaluation of the technical solution for taxonomic trace links show that semi-automation of trace link creation and maintenance is possible. Finally, our empirical evaluation of taxonomic trace links shows that the solution is feasible in practice and can create trace links for multiple purposes.

Conclusion: Traceability between software artifacts has more benefits than currently realized by practitioners. However, current traceability solutions, based on direct trace links, do not appear to be easily adapted in different scenarios to trace different artifacts. Taxonomic trace links are an alternative approach that could overcome the shortcomings of direct trace links. 

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. , p. 187
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:08
Keywords [en]
requirements, traceability, domain-knowledge, taxonomy
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
URN: urn:nbn:se:bth-28451
ISBN: 978-91-7295-504-2 (print)
OAI: oai:DiVA.org:bth-28451
DiVA, id: diva2:1987769
Public defence
2025-10-07, C413A, Karlskrona, 13:00 (English)
Part of project
D-CAT – Digital Collaboration and Automized Tracing Of Information, Swedish Transport Administration
SERT – Software Engineering ReThought, Knowledge Foundation
Available from: 2025-08-07 Created: 2025-08-07 Last updated: 2025-09-30 Bibliographically approved
List of papers
1. Challenges of Requirements Communication and Digital Assets Verification in Infrastructure Projects
2025 (English) In: e-Informatica Software Engineering Journal, ISSN 1897-7979, E-ISSN 2084-4840, Vol. 19, no 1, p. 250107-250107. Article in journal (Refereed) Published
Abstract [en]

Background: Poor communication of requirements between clients and suppliers contributes to project overruns, in both software and infrastructure projects. Existing literature offers limited insights into the communication challenges at this interface.

Aim: Our research aims to explore the processes and associated challenges of requirements activities that involve client-supplier interaction and communication.

Method: We study requirements validation, communication, and digital asset verification processes through two case studies in the road and railway sectors, involving interviews with ten experts across three companies.

Results: We identify 13 challenges, along with their causes and consequences, and suggest solution areas from existing literature.

Conclusion: Interestingly, the challenges in infrastructure projects mirror those found in software engineering, highlighting a need for further research to validate potential solutions.

Keywords
infrastructure, requirements, digital assets, verification, validation
National Category
Software Engineering; Infrastructure Engineering
Research subject
Systems Engineering; Software Engineering
Identifiers
urn:nbn:se:bth-28447 (URN)
10.37190/e-inf250107 (DOI)
2-s2.0-105015147030 (Scopus ID)
Available from: 2025-08-05 Created: 2025-08-05 Last updated: 2025-11-28 Bibliographically approved
2. Auxiliary Artifacts in Requirements Traceability: A Systematic Mapping Study
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Background: Traceability between software artifacts enhances the value of the information those artifacts contain, but only when the links themselves are reliable. Link quality is known to depend on explicit factors such as the traced artifacts and the expertise of the practitioner who judges each connection. Other factors, however, remain largely unexplored. We contend that one of these factors is the set of auxiliary artifacts, that is, artifacts that are produced and/or used during the tracing process, yet are neither the source nor target artifacts. Because such auxiliary artifacts can subtly steer how links are created and validated, they merit a literature survey to identify these artifacts and further investigate them.

Objective: We identify and map auxiliary artifacts used in requirements tracing, which could be additional factors that affect the quality of the trace links.

Method: We conducted a systematic mapping study on auxiliary artifacts in requirements traceability.

Results: We found 110 studies in which auxiliary artifacts are used in requirements tracing, and identified 49 auxiliary artifacts and 13 usage scenarios.

Conclusion: This study provides a systematic mapping of auxiliary artifacts in requirements tracing, including their usage, origin, type, and tool support. The use of auxiliary artifacts in requirements tracing seems to be the norm; thus, these artifacts should be studied in depth to identify how they affect the quality of trace links.

Keywords
requirements, traceability, auxiliary artifacts
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-28443 (URN)
Funder
Swedish Transport Administration, D-CAT
Available from: 2025-08-05 Created: 2025-08-05 Last updated: 2025-09-30 Bibliographically approved
3. Multi-Label Requirements Classification with Large Taxonomies
2024 (English) In: Proceedings of the IEEE International Conference on Requirements Engineering / [ed] Liebel G., Hadar I., Spoletini P., IEEE Computer Society, 2024, p. 264-274. Conference paper, Published paper (Refereed)
Abstract [en]

Context and motivation:

Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification.

Question/problem:

Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies.

Principal ideas/results:

We associated, together with domain experts from the industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall compared to the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total number of nodes and the number of leaf nodes in the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier.

Contribution:

We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning. © 2024 IEEE.
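The idea of zero-shot multi-label classification against taxonomy labels can be illustrated with a minimal sketch. It is not the paper's pipeline: the bag-of-words `embed` function is a toy stand-in for a word- or sentence-based language-model embedding, and the function names, example labels, and the similarity threshold of 0.2 are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in (assumption) for a language-model embedding:
    # a bag-of-words count vector over lowercase tokens.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_labels(requirement, taxonomy_labels, threshold=0.2):
    # No training data is used: every taxonomy label whose similarity
    # to the requirement reaches the threshold is assigned (multi-label).
    req_vec = embed(requirement)
    return sorted(
        label
        for label in taxonomy_labels
        if cosine(req_vec, embed(label)) >= threshold
    )


# Hypothetical taxonomy labels and requirement text.
labels = ["road barrier", "traffic signal", "bridge deck"]
requirement = "Traffic signal shall be visible from the road"
predicted = zero_shot_labels(requirement, labels)
print(predicted)
```

With a real embedding model in place of `embed`, the same thresholded-similarity structure applies; the choice of threshold then controls the trade-off between precision and recall.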

Place, publisher, year, edition, pages
IEEE Computer Society, 2024
Series
International Requirements Engineering Conference, ISSN 1090-705X
Keywords
domain-specific taxonomy, large-scale, multi-label, requirements classification, Multiprogramming, Requirements engineering, Software design, Taxation, Taxonomies, Development activity, Domain specific, Large-scales, Multi-class classification, Multi-label classifications, Multi-labels, Requirements classifications, Requirements traceability, Sentence-based, Zero-shot learning
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-26893 (URN)
10.1109/RE59067.2024.00033 (DOI)
001300544600025 ()
2-s2.0-85202739889 (Scopus ID)
9798350395112 (ISBN)
Conference
32nd IEEE International Requirements Engineering Conference, RE 2024, Reykjavik, June 24-28 2024
Available from: 2024-09-10 Created: 2024-09-10 Last updated: 2025-09-30 Bibliographically approved
4. Language Models to Support Multi-Label Classification of Industrial Data
2025 (English) In: Proceedings - 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2025, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 45-55. Conference paper, Published paper (Refereed)
Abstract [en]

Background:

Multi-label requirements classification is an inherently challenging task, especially when dealing with numerous classes at varying levels of abstraction. The task becomes even more difficult when a limited number of requirements is available to train a supervised classifier. Zero-shot learning does not require training data and can potentially address this problem.

Objective:

This paper investigates the performance of zero-shot classifiers on a multi-label industrial dataset. The study focuses on classifying requirements according to a hierarchical taxonomy designed to support requirements tracing.

Method:

We compare multiple variants of zero-shot classifiers using different embeddings, including 9 language models (LMs) with a reduced number of parameters (up to 3B), e.g., BERT, and 5 large LMs (LLMs) with a large number of parameters (up to 70B), e.g., Llama. Our ground truth includes 377 requirements and 1968 labels from 6 output spaces. For the evaluation, we adopt traditional metrics, i.e., precision, recall, $F_1$, and $F_\beta$, as well as a novel label distance metric $D_n$. This aims to better capture the classification's hierarchical nature and to provide a more nuanced evaluation of how far the results are from the ground truth.
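For readers unfamiliar with the traditional metrics named above, the following minimal sketch computes precision, recall, and $F_\beta$ for a single multi-label prediction. The function name, the example label sets, and the choice of $\beta = 2$ are assumptions for illustration; the record does not state which $\beta$ the paper uses, and the novel distance metric $D_n$ is not sketched here because its definition is not given in this abstract.

```python
def precision_recall_fbeta(predicted, truth, beta=1.0):
    # Set-based metrics for one multi-label prediction.
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    # F_beta weights recall beta times as much as precision.
    fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, fbeta


# Hypothetical labels: two of four predictions are correct,
# and two of three true labels are found.
p, r, f2 = precision_recall_fbeta({"A", "B", "C", "D"}, {"A", "B", "E"}, beta=2.0)
print(p, r, f2)
```

With $\beta = 1$ the same function yields the $F_1$ score, the harmonic mean of precision and recall.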

Results:

1) The top-performing model on 5 out of 6 output spaces is T5-xl, with maximum $F_\beta = 0.78$ and $D_n = 0.04$, while BERT base outperformed the other models in one case, with maximum $F_\beta = 0.83$ and $D_n = 0.04$. 2) LMs with smaller parameter size produce the best classification results compared to LLMs. Thus, addressing the problem in practice is feasible as limited computing power is needed. 3) The model architecture (autoencoding, autoregression, and sentence-to-sentence) significantly affects the classifier's performance.

Contribution:

We conclude that using zero-shot learning for multi-label requirements classification offers promising results. We also present a novel metric that can be used to select the top-performing model for this problem.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Series
Proceedings of the ... European Conference on Software Maintenance and Reengineering, ISSN 1534-5351
Keywords
multi-label, requirements classification, taxonomy, language models
National Category
Natural Language Processing; Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-27813 (URN)
10.1109/SANER64311.2025.00013 (DOI)
001506888600005 ()
2-s2.0-105007293644 (Scopus ID)
9798331535100 (ISBN)
Conference
32nd IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2025, Montreal, Mar 4-7, 2025
Funder
Knowledge Foundation, 20180010
Available from: 2025-05-08 Created: 2025-05-08 Last updated: 2025-09-30 Bibliographically approved
5. Empirical Evaluation of Taxonomic Trace Links: A Case Study
2026 (English) In: Empirical Software Engineering, ISSN 1382-3256, E-ISSN 1573-7616, Vol. 31, no 2, article id 34. Article in journal (Refereed) Published
Abstract [en]

Context: Traceability is a key quality attribute of artifacts that are used in knowledge-intensive tasks and supports software engineers in producing higher-quality software. Despite its clear benefits, traceability is often neglected in practice due to challenges such as granularity of traces, lack of a common artifact structure, and unclear responsibility. The Taxonomic Trace Links (TTL) approach connects source and target artifacts through a domain-specific taxonomy, aiming to address these common traceability challenges.

Objective: In this study, we empirically evaluate TTL in an industrial setting to identify its strengths and weaknesses for real-world adoption.

Method: We conducted a mixed-methods study at Ericsson involving one of its software products. Quantitative and qualitative data were collected across two traceability use cases. We established trace links between 463 business use cases, 64 test cases, and 277 ISO-standard requirements. Additionally, we held three focus group sessions with practitioners.

Results: We identified two practically relevant scenarios where traceability is required and evaluated TTL in each. Overall, practitioners found TTL to be a useful solution for identifying trace links with reasonable effort. However, developing a domain-specific taxonomy and managing heterogeneous artifact structures were noted as significant challenges.

Conclusion: TTL is a promising approach that can be adopted in practice and enables traceability use cases. However, TTL is not a replacement for traditional trace links but a complement that enables more traceability use cases and encourages early trace link creation.

Place, publisher, year, edition, pages
Springer, 2026
Keywords
Evaluation, Requirements traceability, Taxonomy, Trace link
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-28444 (URN)
10.1007/s10664-025-10764-5 (DOI)
001632325800013 ()
2-s2.0-105024065754 (Scopus ID)
Available from: 2025-08-05 Created: 2025-08-05 Last updated: 2026-01-05 Bibliographically approved
6. Taxonomic Trace Links - Rethinking Traceability and its Benefits
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Traceability greatly supports knowledge-intensive tasks, e.g., coverage check and impact analysis. Despite its clear benefits, the practical implementation of traceability poses significant challenges, leading to a reduced focus on the creation and maintenance of trace links. We propose a new approach, Taxonomic Trace Links (TTL), which rethinks traceability and its benefits. With TTL, trace links are created indirectly through a domain-specific taxonomy, a simplified version of a domain model. TTL has the potential to address key traceability challenges, such as the granularity of trace links, the lack of a common data structure among software development artifacts, and unclear responsibility for traceability. We explain how TTL addresses these challenges and perform an initial validation with practitioners. We identified six challenges associated with TTL implementation that need to be addressed. Finally, we propose a research roadmap to further develop and evaluate the technical solution of TTL. TTL appears to be particularly feasible in practice where a domain taxonomy is already established.
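The notion of an indirect trace link can be made concrete with a small sketch. It assumes that each artifact has already been classified with one or more taxonomy classes and that a source and a target are linked whenever they share at least one class. This matching rule, the function name, and the example artifacts are illustrative assumptions, not the authors' implementation; in particular, the sketch ignores the taxonomy hierarchy.

```python
from collections import defaultdict


def taxonomic_trace_links(source_classes, target_classes):
    # Index target artifacts by the taxonomy classes assigned to them.
    targets_by_class = defaultdict(set)
    for target, classes in target_classes.items():
        for taxonomy_class in classes:
            targets_by_class[taxonomy_class].add(target)

    # A source and a target are linked indirectly when they share a class;
    # the shared classes are kept as the rationale for the link.
    links = defaultdict(set)
    for source, classes in source_classes.items():
        for taxonomy_class in classes:
            for target in targets_by_class.get(taxonomy_class, ()):
                links[(source, target)].add(taxonomy_class)
    return dict(links)


# Hypothetical artifacts and taxonomy classes.
requirements = {"REQ-1": {"Bridge", "Lighting"}, "REQ-2": {"Drainage"}}
test_cases = {
    "TC-1": {"Lighting"},
    "TC-2": {"Bridge", "Lighting"},
    "TC-3": {"Tunnel"},
}
links = taxonomic_trace_links(requirements, test_cases)
for pair, shared in sorted(links.items()):
    print(pair, sorted(shared))
```

Because no artifact refers to another artifact directly, sources and targets can be classified independently and at different times, and links follow from the shared classes.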

Keywords
Requirements traceability, Taxonomy, Domain Modeling, Classification
National Category
Software Engineering
Research subject
Software Engineering; Systems Engineering
Identifiers
urn:nbn:se:bth-28442 (URN)
Funder
Swedish Transport Administration, D-CAT
Available from: 2025-08-05 Created: 2025-08-05 Last updated: 2025-09-30 Bibliographically approved

Open Access in DiVA

fulltext (3868 kB), 157 downloads
File information
File name: FULLTEXT01.pdf
File size: 3868 kB
Checksum (SHA-512): 7447a78e22da400337de8d3b76871b996d72bc48ef721184054bcbf8474eeb86133bcbac64b526464b755c8b8f934525232af52a63e10b7ef650a5ab4ad135ac
Type: fulltext
Mimetype: application/pdf

Authority records

Abdeen, Waleed
