Vowel segmentation impact on machine learning classification for chronic obstructive pulmonary disease
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0003-1558-2309
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0002-6752-017X
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-4390-411X
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0001-9870-8477
2025 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 15, no. 1, article id 9930. Article in journal (Refereed). Published.
Abstract [en]

Vowel-based voice analysis is gaining attention as a potential non-invasive tool for COPD classification, offering insights into phonatory function. The growing need for voice data has led to the adoption of various techniques, including segmentation, to augment existing datasets for training comprehensive Machine Learning (ML) models.

This study investigates how segmenting utterances of the vowel "a" affects the performance of three ML classifiers: CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). Individual models were trained on three distinct dataset constructions (full-sequence, segment-wise, and group-wise), all derived from 1058 recordings of the vowel "a" from 48 participants. This design allows the effect of each dataset construction on model performance to be compared directly.
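
As a rough illustration of what a segment-wise construction could look like, the sketch below splits one recording into contiguous segments of near-equal length. The abstract does not describe the actual segmentation procedure, so the function name `split_into_segments`, the equal-length split, and the integer stand-in for audio samples are all assumptions made for this example.

```python
def split_into_segments(samples, n_segments):
    """Split a sequence into n contiguous, near-equal-length segments.

    Hypothetical helper: the study's real segmentation rule is not given
    in the abstract, so an even split is assumed here.
    """
    if not 1 <= n_segments <= len(samples):
        raise ValueError("n_segments must be between 1 and len(samples)")
    base, extra = divmod(len(samples), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        # The first `extra` segments take one additional sample each.
        size = base + (1 if i < extra else 0)
        segments.append(samples[start:start + size])
        start += size
    return segments


recording = list(range(10))  # stand-in for the samples of one utterance
segments = split_into_segments(recording, 4)
second_segment = segments[1]  # e.g. the second of four segments
```

Under this assumption, features would then be extracted per segment rather than per full recording, so each utterance contributes several training examples.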

A nested cross-validation (nCV) approach with grid search for hyperparameter optimization was used to minimize overfitting risks and maximize model performance. The findings indicate that, within the four-segment category, the second segment yielded higher results than the full-sequence dataset. Specifically, the CB model achieved the highest accuracy, attaining 97.8% and 84.6% on the validation and test sets, respectively. For the same category, the CB model also showed the best balance between true positive rate (TPR) and true negative rate (TNR), making it the most clinically effective choice.
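
The structure of nested cross-validation with grid search can be sketched in a few lines of standard-library Python. This is only an illustration of the general technique: the toy midpoint classifier, the `offsets` grid, the fold counts, and the synthetic data are assumptions for the example and are not the study's models (CB, RF, SVM), grids, or data.

```python
from statistics import mean


def k_folds(n_items, k):
    """Return k contiguous folds of indices with near-equal sizes."""
    base, extra = divmod(n_items, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_midpoint(xs, ys, idx):
    """Toy 'training' step: midpoint between the two class means."""
    mean0 = mean(xs[i] for i in idx if ys[i] == 0)
    mean1 = mean(xs[i] for i in idx if ys[i] == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, xs, ys, idx):
    """Accuracy of the rule 'predict class 1 when x >= threshold'."""
    return mean(1.0 if (xs[i] >= threshold) == (ys[i] == 1) else 0.0 for i in idx)


def nested_cv(xs, ys, offsets, outer_k=3, inner_k=2):
    """Outer loop estimates performance; inner loop picks the hyperparameter."""
    outer_scores = []
    for test_idx in k_folds(len(xs), outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]

        def inner_score(offset):
            # Grid search sees only the outer-training data.
            scores = []
            for val_pos in k_folds(len(train_idx), inner_k):
                val_idx = [train_idx[p] for p in val_pos]
                val_set = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val_set]
                threshold = fit_midpoint(xs, ys, fit_idx) + offset
                scores.append(accuracy(threshold, xs, ys, val_idx))
            return mean(scores)

        best_offset = max(offsets, key=inner_score)
        # Refit on all outer-training data, then score on the held-out fold.
        threshold = fit_midpoint(xs, ys, train_idx) + best_offset
        outer_scores.append(accuracy(threshold, xs, ys, test_idx))
    return mean(outer_scores)


# Synthetic, well-separated data (illustrative only).
xs = [1.0, 8.0, 2.0, 9.0] * 3
ys = [0, 1, 0, 1] * 3
score = nested_cv(xs, ys, offsets=[-0.5, 0.0, 0.5])
```

The key property is that the held-out outer fold never influences hyperparameter selection, which is why the outer score is a less optimistic estimate than the inner validation score.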

These findings suggest that time-sensitive properties in vowel production are important for COPD classification and that segmentation can aid in capturing these properties. Despite these promising results, the dataset size and demographic homogeneity limit generalizability, highlighting areas for future research.

Trial registration: The study is registered on clinicaltrials.gov with ID NCT06160674.

Place, publisher, year, edition, pages
Nature Publishing Group, 2025. Vol. 15, no 1, article id 9930
Keywords [en]
Chronic obstructive pulmonary disease (COPD), Classification, Machine learning, Vowel segmentation
National Category
Artificial Intelligence; Respiratory Medicine and Allergy
Identifiers
URN: urn:nbn:se:bth-27695
DOI: 10.1038/s41598-025-95320-3
ISI: 001504610200010
Scopus ID: 2-s2.0-105000630528
OAI: oai:DiVA.org:bth-27695
DiVA, id: diva2:1950115
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2025-04-04 Created: 2025-04-04 Last updated: 2025-10-28 Bibliographically approved
In thesis
1. Voice as a Digital Biomarker: Machine Learning Applications for Chronic Obstructive Pulmonary Disease Assessment
2025 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity and mortality worldwide, with high underdiagnosis rates due to limitations in current diagnostic methods such as spirometry. This doctoral thesis explores the potential of voice as a digital biomarker to support the assessment of COPD, guided by the principles of Applied Health Technology (AHT), which emphasizes interdisciplinary collaboration and real-world applicability.

The research includes four interconnected studies. Study I presents a systematic literature review of machine learning (ML) applications for voice-affecting disorders, identifying COPD as underrepresented in current research. Study II addresses this gap by collecting a new dataset of vowel [a:] recordings from Swedish-speaking COPD patients and healthy controls, recorded once a week in quiet settings chosen by the participants. Voice features, including baseline acoustic (BLA) parameters and Mel-Frequency Cepstral Coefficients (MFCCs), were extracted and used to train three ML classifiers: CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). CB demonstrated the highest test accuracy at 78%.

Study III investigates the effects of signal segmentation on model performance and shows that certain temporal segments of voice recordings contain more informative patterns, enhancing classification outcomes by increasing accuracy to 85%. Study IV applies statistical and practical significance tests to compare voice features between COPD and healthy groups. A total of 34 features, including shimmer measures and higher-order MFCC derivatives, were found to meaningfully differentiate the groups. 

This thesis reframes the human voice as a source of clinically relevant data, demonstrating how it can be digitized, analyzed, and interpreted using ML to aid COPD assessment. The results indicate that voice-based analysis can provide an accessible, non-invasive, and scalable complement to existing diagnostic tools. By integrating technical, clinical, and ethical perspectives, the thesis contributes new knowledge and practical methodologies that align with AHT's goal of creating value-driven, user-centered healthcare solutions. The findings support future development of mobile and remote voice-based screening tools for COPD and other conditions.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. p. 160
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:07
Keywords
Chronic Obstructive Pulmonary Disease, Machine Learning, Noninvasive Diagnostic, Segmentation, Voice-based Analysis
National Category
Respiratory Medicine and Allergy; Medical and Health Sciences; Medical Informatics
Research subject
Applied Health Technology
Identifiers
urn:nbn:se:bth-28038 (URN)
978-91-7295-503-5 (ISBN)
Public defence
2025-10-15, J1630, Valhallavägen 1, Karlskrona, 13:00 (English)
Available from: 2025-08-11 Created: 2025-06-10 Last updated: 2025-10-28 Bibliographically approved

Open Access in DiVA

fulltext (2705 kB), 60 downloads
File information
File name: FULLTEXT01.pdf
File size: 2705 kB
Checksum (SHA-512): ed1d6222cc4a29f963cc2fbe4907c1c32c395cefcbac03139c8ff1d46262db6c797677f346680feaa7e3e24377c4ea2638e868202b81fce65f304676710bea8b
Type: fulltext
Mimetype: application/pdf

Authority records

Idrisoglu, Alper; Dallora Moraes, Ana Luiza; Cheddad, Abbas; Anderberg, Peter; Sanmartin Berglund, Johan
