Multi-Label Requirements Classification with Large Taxonomies
2024 (English). In: Proceedings of the IEEE International Conference on Requirements Engineering / [ed] Liebel G., Hadar I., Spoletini P., IEEE Computer Society, 2024, p. 264-274. Conference paper, Published paper (Refereed)
Abstract [en]
Context and motivation:
Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification.
Question/problem:
Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies.
Principal ideas/results:
We associated, together with domain experts from industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall compared to the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total number of nodes and the number of leaf nodes of the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier.
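For readers unfamiliar with how precision, recall, and F1-score apply when each requirement carries several labels, the following is a minimal sketch, not the authors' evaluation code. It assumes the scores are computed per requirement on label sets and then averaged over requirements; the abstract does not state which averaging scheme the paper uses. The requirement identifiers and label names are hypothetical.

```python
from typing import Dict, Set, Tuple


def prf(predicted: Set[str], expected: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the label sets of one requirement."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_over_requirements(
    predictions: Dict[str, Set[str]],
    ground_truth: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Average the per-requirement scores (an assumed averaging scheme)."""
    scores = [
        prf(predictions.get(req, set()), labels)
        for req, labels in ground_truth.items()
    ]
    if not scores:
        return 0.0, 0.0, 0.0
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


# Hypothetical toy data: two requirements with expert-assigned labels.
ground_truth = {"REQ-1": {"door", "lock"}, "REQ-2": {"bridge"}}
# A classifier that over-predicts on REQ-1 and misses REQ-2 entirely.
predictions = {"REQ-1": {"door", "lock", "window", "wall"}, "REQ-2": {"tunnel"}}

precision, recall, f1 = average_over_requirements(predictions, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the classifier recovers every expected label for REQ-1 but also returns two extra ones, so recall for that requirement is high while precision is not. That is the kind of gap finding (1) describes between the two measures.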
Contribution:
We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning. © 2024 IEEE.
Place, publisher, year, edition, pages
IEEE Computer Society, 2024. p. 264-274
Series
International Requirements Engineering Conference, ISSN 1090-705X
Keywords [en]
domain-specific taxonomy, large-scale, multi-label, requirements classification, Multiprogramming, Requirements engineering, Software design, Taxation, Taxonomies, Development activity, Domain specific, Large-scales, Multi-class classification, Multi-label classifications, Multi-labels, Requirements classifications, Requirements traceability, Sentence-based, Zero-shot learning
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-26893
DOI: 10.1109/RE59067.2024.00033
ISI: 001300544600025
Scopus ID: 2-s2.0-85202739889
ISBN: 9798350395112 (print)
OAI: oai:DiVA.org:bth-26893
DiVA, id: diva2:1896290
Conference
32nd IEEE International Requirements Engineering Conference, RE 2024, Reykjavik, June 24-28 2024
2024-09-10 2024-09-10 2025-09-30 Bibliographically approved