Vowel segmentation impact on machine learning classification for chronic obstructive pulmonary disease
Blekinge Tekniska Högskola, Fakulteten för teknikvetenskaper, Institutionen för hälsa. ORCID iD: 0000-0003-1558-2309
Blekinge Tekniska Högskola, Fakulteten för teknikvetenskaper, Institutionen för hälsa. ORCID iD: 0000-0002-6752-017X
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datavetenskap. ORCID iD: 0000-0002-4390-411X
Blekinge Tekniska Högskola, Fakulteten för teknikvetenskaper, Institutionen för hälsa. ORCID iD: 0000-0001-9870-8477
2025 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 15, no. 1, article id 9930. Article in journal (Peer-reviewed). Published
Abstract [en]

Vowel-based voice analysis is gaining attention as a potential non-invasive tool for COPD classification, offering insights into phonatory function. The growing need for voice data has led to the adoption of various techniques, including segmentation, to augment existing datasets for training comprehensive Machine Learning (ML) models.

This study investigates how segmenting utterances of the vowel "a" affects the performance of three ML classifiers: CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). Individual ML models were trained on three distinct dataset constructions (full-sequence, segment-wise, and group-wise) derived from 1058 recordings of the vowel "a" from 48 participants. This design allows a comparison of how each dataset construction affects model performance and results.
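
As an illustration of the dataset constructions named above, the following minimal sketch splits one recording into equal-length contiguous segments. The abstract does not define how segment-wise and group-wise datasets were built, so the interpretation here (segment-wise keeps one segment position, group-wise pools all segments of a category) is an assumption, as are the helper name `split_into_segments` and the toy sample values.

```python
from typing import List, Sequence


def split_into_segments(signal: Sequence[float], n_segments: int) -> List[List[float]]:
    """Split a signal into n_segments contiguous parts of near-equal length."""
    if n_segments < 1 or n_segments > len(signal):
        raise ValueError("n_segments must be between 1 and len(signal)")
    # Integer boundaries; any remainder is spread over the segments.
    bounds = [(i * len(signal)) // n_segments for i in range(n_segments + 1)]
    return [list(signal[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


# Toy "recording" of 8 samples standing in for a sustained vowel (hypothetical data).
recording = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

full_sequence = [recording]                    # assumed: the whole utterance as one item
segments = split_into_segments(recording, 4)   # a four-segment category
second_segment = segments[1]                   # assumed segment-wise: one position only
group_wise = segments                          # assumed group-wise: all segments pooled
```

In a real pipeline, acoustic features would be extracted from each segment before training; that step is omitted here.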

A nested cross-validation (nCV) approach with grid search for hyperparameter optimization was used to reduce the risk of overfitting while tuning model performance. The findings indicate that, within the four-segment category, the second segment yielded higher results than the full-sequence dataset. Specifically, the CB model achieved the highest accuracy, attaining 97.8% and 84.6% on the validation and test sets, respectively. In the same category, the CB model also showed the best balance between true positive rate (TPR) and true negative rate (TNR), making it the most clinically effective choice.
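
The structure of nested cross-validation with grid search can be sketched as follows. This is not the study's pipeline: a one-dimensional k-nearest-neighbour rule stands in for CB, RF, and SVM, and the fold counts, the hyperparameter grid, and the toy data are all assumptions made for illustration. The inner loop selects a hyperparameter using only the outer training data; the outer loop scores that choice on data the selection never saw.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label); hypothetical: 1 = COPD, 0 = healthy


def k_folds(n_items: int, k: int) -> List[List[int]]:
    """Assign indices 0..n_items-1 to k folds in round-robin order."""
    return [list(range(i, n_items, k)) for i in range(k)]


def split(data: Sequence[Sample], fold: Sequence[int]) -> Tuple[List[Sample], List[Sample]]:
    """Return (training samples, held-out samples) for one fold."""
    held_out = set(fold)
    train = [s for i, s in enumerate(data) if i not in held_out]
    return train, [data[i] for i in fold]


def knn_predict(train: Sequence[Sample], x: float, k: int) -> int:
    """Majority vote among the k training samples closest to x (ties -> 0)."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > len(nearest))


def evaluate(train: Sequence[Sample], test: Sequence[Sample], k: int) -> float:
    """Accuracy on test when predicting from train with k neighbours."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def grid_search(data: Sequence[Sample], grid: Sequence[int], n_inner: int) -> int:
    """Inner loop: pick the grid value with the best mean validation accuracy."""
    def mean_score(k: int) -> float:
        scores = []
        for fold in k_folds(len(data), n_inner):
            inner_train, inner_val = split(data, fold)
            scores.append(evaluate(inner_train, inner_val, k))
        return sum(scores) / len(scores)
    return max(grid, key=mean_score)


def nested_cv(data: Sequence[Sample], grid: Sequence[int],
              n_outer: int, n_inner: int) -> List[float]:
    """Outer loop: tune on the outer training split, score on the outer test split."""
    outer_scores = []
    for fold in k_folds(len(data), n_outer):
        outer_train, outer_test = split(data, fold)
        best_k = grid_search(outer_train, grid, n_inner)
        outer_scores.append(evaluate(outer_train, outer_test, best_k))
    return outer_scores


# Toy, well-separated data (hypothetical): six samples per class.
data = [(i / 10, 0) for i in range(6)] + [(2 + i / 10, 1) for i in range(6)]
scores = nested_cv(data, grid=[1, 3], n_outer=3, n_inner=2)
```

With several recordings per participant, a real setup would also need to decide whether all recordings from one participant stay in the same fold; the abstract does not say how this was handled, and the sketch ignores it.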

These findings suggest that time-sensitive properties in vowel production are important for COPD classification and that segmentation can aid in capturing these properties. Despite these promising results, the dataset size and demographic homogeneity limit generalizability, highlighting areas for future research.

Trial registration: The study is registered on clinicaltrials.gov with ID NCT06160674.

Place, publisher, year, edition, pages
Nature Publishing Group, 2025. Vol. 15, no. 1, article id 9930
Keywords [en]
Chronic obstructive pulmonary disease (COPD), Classification, Machine learning, Vowel segmentation
National Category
Identifiers
URN: urn:nbn:se:bth-27695
DOI: 10.1038/s41598-025-95320-3
ISI: 001504610200010
Scopus ID: 2-s2.0-105000630528
OAI: oai:DiVA.org:bth-27695
DiVA, id: diva2:1950115
Funder
ELLIIT - The Linköping-Lund Initiative on IT and Mobile Communications
Available from: 2025-04-04 Created: 2025-04-04 Last updated: 2025-10-28 Bibliographically approved
In thesis
1. Voice as a Digital Biomarker: Machine Learning Applications for Chronic Obstructive Pulmonary Disease Assessment
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity and mortality worldwide, with high underdiagnosis rates due to limitations in current diagnostic methods such as spirometry. This doctoral thesis explores the potential of voice as a digital biomarker to support the assessment of COPD, guided by the principles of Applied Health Technology (AHT), which emphasizes interdisciplinary collaboration and real-world applicability.

The research includes four interconnected studies. Study I presents a systematic literature review of machine learning (ML) applications for voice-affecting disorders, identifying COPD as underrepresented in current research. Study II addresses this gap by collecting a new dataset of vowel [a:] recordings from Swedish-speaking COPD patients and healthy controls, recorded once a week in quiet settings of the participants' own choosing. Voice features, including baseline acoustic (BLA) parameters and Mel-Frequency Cepstral Coefficients (MFCCs), were extracted and used to train three ML classifiers: CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). CB demonstrated the highest test accuracy at 78%.

Study III investigates the effects of signal segmentation on model performance and shows that certain temporal segments of voice recordings contain more informative patterns, enhancing classification outcomes by increasing accuracy to 85%. Study IV applies statistical and practical significance tests to compare voice features between COPD and healthy groups. A total of 34 features, including shimmer measures and higher-order MFCC derivatives, were found to meaningfully differentiate the groups. 
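
To make the idea of practical significance concrete, the sketch below computes a standardized effect size for one feature across two groups. The abstract does not name the tests used in Study IV, so the choice of Cohen's d (with pooled standard deviation) is an assumption, and the feature values are invented for illustration only.

```python
import math
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical values of one voice feature (for example a shimmer measure) per group.
copd_values = [2.0, 4.0, 6.0]
healthy_values = [1.0, 3.0, 5.0]

effect_size = cohens_d(copd_values, healthy_values)
```

An effect size like this complements a p-value: it describes how large a group difference is, not only whether it is unlikely under the null hypothesis.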

This thesis reframes the human voice as a source of clinically relevant data, demonstrating how it can be digitized, analyzed, and interpreted using ML to aid COPD assessment. The results indicate that voice-based analysis can provide an accessible, non-invasive, and scalable complement to existing diagnostic tools. By integrating technical, clinical, and ethical perspectives, the thesis contributes new knowledge and practical methodologies that align with AHT's goal of creating value-driven, user-centered healthcare solutions. The findings support future development of mobile and remote voice-based screening tools for COPD and other conditions.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. p. 160
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:07
Keywords
Chronic Obstructive Pulmonary Disease, Machine Learning, Noninvasive Diagnostic, Segmentation, Voice-based Analysis
National Category
Research program
Applied Health Technology
Identifiers
urn:nbn:se:bth-28038 (URN)
978-91-7295-503-5 (ISBN)
Public defence
2025-10-15, J1630, Valhallavägen 1, Karlskrona, 13:00 (English)
Opponent
Supervisors
Available from: 2025-08-11 Created: 2025-06-10 Last updated: 2025-10-28 Bibliographically approved

Open Access in DiVA

fulltext (2705 kB), 60 downloads
File information
File name: FULLTEXT01.pdf
File size: 2705 kB
Checksum (SHA-512):
ed1d6222cc4a29f963cc2fbe4907c1c32c395cefcbac03139c8ff1d46262db6c797677f346680feaa7e3e24377c4ea2638e868202b81fce65f304676710bea8b
Type: fulltext
Mimetype: application/pdf


Authors

Idrisoglu, Alper; Dallora Moraes, Ana Luiza; Cheddad, Abbas; Anderberg, Peter; Sanmartin Berglund, Johan
