Multi-Label Requirements Classification with Large Taxonomies
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0001-8142-9631
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0003-4118-0952
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0003-3567-9300
Hochtief ViCon GmbH, Germany.
2024 (English). In: Proceedings of the IEEE International Conference on Requirements Engineering / [ed] Liebel G., Hadar I., Spoletini P., IEEE Computer Society, 2024, p. 264-274. Conference paper, Published paper (Refereed)
Abstract [en]

Context and motivation:

Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification.

Question/problem:

Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies.
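To make the zero-shot idea concrete, the following is a minimal sketch, not the authors' pipeline: it assigns taxonomy labels to a requirement by text similarity alone, with no training data. The bag-of-words vectors stand in for whatever word or sentence embeddings a real classifier would use; the function names, the example labels, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased, whitespace-tokenised term counts (toy stand-in for an embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_labels(requirement, taxonomy_labels, threshold=0.3):
    """Return every label whose similarity to the requirement reaches the threshold.

    Multi-label: zero, one, or many labels may be returned. The threshold
    is an assumed value chosen only for this illustration.
    """
    req_vec = bag_of_words(requirement)
    return sorted(
        label
        for label in taxonomy_labels
        if cosine(req_vec, bag_of_words(label)) >= threshold
    )


# Hypothetical taxonomy labels and requirement text.
labels = ["fire door", "door handle", "concrete wall", "ventilation duct"]
predicted = zero_shot_labels("The fire door shall close automatically", labels)
```

Lowering the threshold admits more labels, which tends to raise recall at the cost of precision; no labelled training examples are needed in either case.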

Principal ideas/results:

Together with domain experts from industry, we associated 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall than the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total number of nodes and the number of leaf nodes in the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier.
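As a reminder of how the reported metrics relate, here is a minimal sketch of precision, recall, and F1 for a single requirement's predicted label set against its ground-truth label set. This record does not state how the paper aggregates scores across requirements, so the per-requirement set-based form below is an assumption, and the label names are hypothetical.

```python
def precision_recall_f1(predicted, expected):
    """Set-based precision, recall, and F1 for one multi-label prediction."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# A classifier that predicts many labels can find every expected label
# (high recall) while half of its predictions are wrong (low precision).
p, r, f = precision_recall_f1(
    predicted={"door", "wall", "roof", "window"},
    expected={"door", "wall"},
)
```

In this example the recall is perfect while precision is only one half, which illustrates how recall can improve without a matching gain in precision or F1.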

Contribution:

We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning. © 2024 IEEE.

Place, publisher, year, edition, pages
IEEE Computer Society, 2024. p. 264-274
Series
International Requirements Engineering Conference, ISSN 1090-705X
Keywords [en]
domain-specific taxonomy, large-scale, multi-label, requirements classification, Multiprogramming, Requirements engineering, Software design, Taxation, Taxonomies, Development activity, Domain specific, Large-scales, Multi-class classification, Multi-label classifications, Multi-labels, Requirements classifications, Requirements traceability, Sentence-based, Zero-shot learning
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-26893
DOI: 10.1109/RE59067.2024.00033
ISI: 001300544600025
Scopus ID: 2-s2.0-85202739889
ISBN: 9798350395112 (print)
OAI: oai:DiVA.org:bth-26893
DiVA, id: diva2:1896290
Conference
32nd IEEE International Requirements Engineering Conference, RE 2024, Reykjavik, June 24-28 2024
Available from: 2024-09-10 Created: 2024-09-10 Last updated: 2025-09-30. Bibliographically approved
In thesis
1. Taxonomic Trace Links in Requirements Engineering
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Background: Software engineering is a knowledge-intensive activity that requires engineers to manage information to collaborate efficiently and effectively. Within Software Engineering, the Requirements Engineering process bridges the knowledge gap between the customer and the development team by eliciting, managing, and communicating product requirements. The traceability of these requirements supports developers in producing higher-quality software that aligns with customer needs. In addition, traceability supports other activities, such as change impact analysis, software quality assurance, and requirements-based verification.

Problem: Despite decades of research on traceability, practical challenges still hinder the adoption of traceability in practice. This signals a need for new ways of practicing traceability that fit real-world needs. 

Goal: Building on previous work, this thesis instantiates, develops, and empirically evaluates Taxonomic Trace Links, a new way to trace requirements to various software artifacts through domain knowledge captured in a taxonomy. 

Method: The studies included in this thesis follow mixed research methods: a case study, systematic mapping studies, a validation study, controlled experiments, and focus groups.

Results: The current state of practice in customer-supplier communication shows persistent challenges that we mapped to solutions in the literature. Our literature study shows that traceability through domain-specific taxonomies has not been empirically evaluated. Our development and evaluation of the technical solution for taxonomic trace links show that semi-automation of trace link creation and maintenance is possible. Finally, our empirical evaluation of taxonomic trace links shows that the solution is feasible in practice and can create trace links for multiple purposes.

Conclusion: Traceability between software artifacts has more benefits than currently realized by practitioners. However, current traceability solutions, based on direct trace links, do not appear to be easily adapted in different scenarios to trace different artifacts. Taxonomic trace links are an alternative approach that could overcome the shortcomings of direct trace links. 

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. p. 187
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:08
Keywords
requirements, traceability, domain-knowledge, taxonomy
National Category
Software Engineering
Research subject
Software Engineering
Identifiers
urn:nbn:se:bth-28451 (URN)
978-91-7295-504-2 (ISBN)
Public defence
2025-10-07, C413A, Karlskrona, 13:00 (English)
Available from: 2025-08-07 Created: 2025-08-07 Last updated: 2025-09-30. Bibliographically approved

Open Access in DiVA

multi_arxiv (554 kB), 43 downloads
File information
File name: FULLTEXT01.pdf
File size: 554 kB
Checksum (SHA-512): 5ff917463dd6cb3c75446bfb89c490b648bb2422a982636b311ae7bf264bcaf29c603152d2aa7dfa28c79718aaec8d6e2c560008e096d2f4a7220411d676a5be
Type: fulltext
Mimetype: application/pdf


Authority records

Abdeen, Waleed; Unterkalmsteiner, Michael; Wnuk, Krzysztof
