COPDVD: Automated classification of chronic obstructive pulmonary disease on a new collected and evaluated voice dataset
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0003-1558-2309
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0002-6752-017X
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-4390-411X
Blekinge Institute of Technology, Faculty of Engineering, Department of Health. ORCID iD: 0000-0001-9870-8477
2024 (English). In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 156, article id 102953. Article in journal (Refereed). Published.
Abstract [en]

Background

Chronic obstructive pulmonary disease (COPD) is a severe condition that affects millions of people worldwide and causes numerous deaths each year. Because significant symptoms are absent in its early stages, the disease is frequently underdiagnosed. Besides impaired pulmonary function, COPD has harmful systemic effects, e.g., heart failure or voice distortion. These systemic effects might, however, provide valuable information for early detection: the symptoms they cause could help detect the condition in its early stages.

Objective

The proposed study aims to explore whether the voice features extracted from the vowel “a” utterance carry any information that can be predictive of COPD by employing Machine Learning (ML) on a newly collected voice dataset.

Methods

Forty-eight participants were recruited from the pool of research clinic visitors at Blekinge Institute of Technology (BTH) in Sweden between January 2022 and May 2023, yielding a dataset of 1246 recordings. After an information and consent meeting with each participant, recordings of the vowel “a” utterance were collected using the VoiceDiagnostic application. Silence segments were removed from the collected voice data, and baseline acoustic features and Mel-Frequency Cepstral Coefficients (MFCC) were extracted. Sociodemographic data was also collected from the participants. Three ML models were investigated for the binary classification of COPD and healthy controls: Random Forest (RF), Support Vector Machine (SVM), and CatBoost (CB). A nested k-fold cross-validation approach was employed, and the hyperparameters of each model were optimized using grid search. To assess performance, accuracy, F1-score, precision, and recall were computed. The best classifier was then examined further using the Area Under the Curve (AUC), Average Precision (AP), and SHapley Additive exPlanations (SHAP) feature-importance measures.
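
As an illustration of the nested k-fold cross-validation with grid search mentioned above, the following is a minimal pure-Python sketch. It is not the authors' pipeline: the one-dimensional toy data, the k-nearest-neighbour classifier, the fold counts, and all function names are assumptions made only to keep the example self-contained. The point it shows is that the hyperparameter is chosen in an inner loop that sees only the outer training folds, so the outer test fold stays untouched until the final scoring.

```python
from statistics import mean

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points (toy classifier)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0

def accuracy(train, test, k):
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)

def split(data, fold):
    """Return (train, test) where test holds the items indexed by fold."""
    held = set(fold)
    train = [d for i, d in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test

def inner_cv_score(data, k, n_folds):
    """Mean accuracy of hyperparameter k over the inner folds."""
    return mean(accuracy(*split(data, fold), k)
                for fold in k_fold_indices(len(data), n_folds))

def nested_cv(data, grid, outer_folds=4, inner_folds=3):
    """Outer loop estimates performance; inner loop does the grid search."""
    outer_scores = []
    for fold in k_fold_indices(len(data), outer_folds):
        train, test = split(data, fold)
        # Grid search uses the outer training part only.
        best_k = max(grid, key=lambda k: inner_cv_score(train, k, inner_folds))
        outer_scores.append(accuracy(train, test, best_k))
    return mean(outer_scores)

# Hypothetical, well-separated toy data: (feature value, label).
toy = [(0.10, 0), (0.90, 1), (0.20, 0), (0.80, 1), (0.15, 0), (0.85, 1),
       (0.25, 0), (0.75, 1), (0.30, 0), (0.70, 1), (0.12, 0), (0.88, 1)]
score = nested_cv(toy, grid=[1, 3])
print(score)
```

When a dataset holds many recordings per participant, folds are usually formed so that all recordings of one participant fall on the same side of a split; the abstract does not state how the folds were built, so the sketch leaves that aspect out.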

Results

The RF, SVM, and CB classifiers achieved maximum accuracies of 77 %, 69 %, and 78 % on the test set and 93 %, 78 %, and 97 % on the validation set, respectively. CB thus outperformed RF and SVM; on further examination it produced an AUC of 82 % and an AP of 76 %. In addition to age and gender, the mean values of baseline acoustic and MFCC features demonstrate high importance and deterministic characteristics for classification performance in both test and validation sets, though in varied order.
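
For readers unfamiliar with the metrics reported above, the sketch below shows how accuracy, precision, recall, F1-score, and AUC can be computed from labels and classifier scores. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the study's data; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive is scored above a randomly chosen negative).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two kinds of measure are often reported together.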

Conclusion

This study concludes that recordings of the vowel “a” utterance contain information that the CatBoost classifier can capture with high accuracy for the classification of COPD. Additionally, baseline acoustic and MFCC features, in conjunction with age and gender information, can be employed for classification purposes and benefit healthcare for decision support in COPD diagnosis.

Place, publisher, year, edition, pages
Elsevier, 2024. Vol. 156, article id 102953
Keywords [en]
Acoustic features, Signal Processing, Automated classification, Chronic obstructive pulmonary disease, Machine Learning
National Category
Respiratory Medicine and Allergy; Signal Processing
Research subject
Applied Health Technology; Applied Signal Processing
Identifiers
URN: urn:nbn:se:bth-26835
DOI: 10.1016/j.artmed.2024.102953
ISI: 001358362000001
Scopus ID: 2-s2.0-85202537741
OAI: oai:DiVA.org:bth-26835
DiVA, id: diva2:1890597
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2024-08-20. Created: 2024-08-20. Last updated: 2025-10-28. Bibliographically approved.
In thesis
1. Voice for Decision Support in Healthcare Applied to Chronic Obstructive Pulmonary Disease Classification: A Machine Learning Approach
2024 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Background: Advancements in machine learning (ML) techniques and voice technology offer the potential to harness voice as a new tool for developing decision-support tools in healthcare for the benefit of both healthcare providers and patients. Motivated by technological breakthroughs and the increasing integration of Artificial Intelligence (AI) and ML in healthcare, numerous studies aim to investigate the diagnostic potential of ML algorithms in the context of voice-affecting disorders. This thesis focuses on respiratory diseases such as Chronic Obstructive Pulmonary Disease (COPD) and explores the potential of a decision support tool that utilizes voice and ML. This exploration exemplifies the intricate relationship between voice and overall health through the lens of applied health technology (AHT). The interdisciplinary nature of this research reflects the need for accurate and efficient diagnostic tools.

Objective: The objectives of this licentiate thesis are twofold. Firstly, a Systematic Literature Review (SLR) thoroughly investigates the current state of ML algorithms in detecting voice-affecting disorders, pinpointing existing gaps and suggesting directions for future research. Secondly, the study focuses on respiratory health, specifically COPD, employing ML techniques with a distinct emphasis on the vowel "A". The aim is to explore hidden information that could potentially be utilized for the binary classification of COPD vs no COPD. The creation of a new Swedish COPD voice classification dataset is anticipated to enhance the experimental and exploratory dimensions of the research.

Methods: Scanning and analyzing the literature is a common way to obtain a holistic view of a research field. Paper I therefore followed the methodology of an SLR, in which existing journal publications were scanned and synthesized to give a holistic view of the ML techniques employed in experiments on voice-affecting disorders. Based on the results from the SLR, Paper II focused on data collection and experimentation for the binary classification of COPD, which was one of the gaps identified in the first study. Three distinct ML algorithms were investigated using voice features extracted from the collected dataset, which consisted of recordings collected through a mobile application from participants 18 years old and above, and the most utilized performance measures were computed to identify the best outcome.

Results: The summary of findings from Paper I reveals the dominance of Support Vector Machine (SVM) classifiers in voice disorder research, with Parkinson's Disease and Alzheimer's Disease as the most studied disorders. Gaps in research include underrepresented disorders, limited datasets in terms of number of participants, and a lack of interest in longitudinal studies. Paper II demonstrates promising results in COPD classification using ML and a newly developed dataset, offering insights into potential decision support tools for COPD diagnosis.

Conclusion: The studies covered in this dissertation provide a comprehensive literature summary of ML techniques used to support decision-making on voice-affecting disorders for clinical outcomes. The findings contribute to understanding the diagnostic potential of using ML on vocal features and highlight avenues for future research and technology development. In addition, the experiment reveals the potential of employing voice as a digital biomarker for COPD diagnosis using ML.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024. p. 103
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 2024:03
Keywords
Automated decision-support, Classification, Machine Learning, Voice-affecting disorders, Voice dataset, Voice Features, Chronic Obstructive pulmonary disease (COPD)
National Category
Medical and Health Sciences; Information Systems
Research subject
Applied Health Technology
Identifiers
urn:nbn:se:bth-25978 (URN); 978-91-7295-476-2 (ISBN)
Presentation
2024-05-23, J1630, Valhallavägen 1, Karlskrona, 10:00 (English)
Available from: 2024-02-19. Created: 2024-02-19. Last updated: 2025-10-28. Bibliographically approved.
2. Voice as a Digital Biomarker: Machine Learning Applications for Chronic Obstructive Pulmonary Disease Assessment
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity and mortality worldwide, with high underdiagnosis rates due to limitations in current diagnostic methods such as spirometry. This doctoral thesis explores the potential of voice as a digital biomarker to support the assessment of COPD, guided by the principles of Applied Health Technology (AHT), which emphasizes interdisciplinary collaboration and real-world applicability.

The research includes four interconnected studies. Study I presents a systematic literature review of machine learning (ML) applications for voice-affecting disorders, identifying COPD as underrepresented in current research. Study II addresses this gap by collecting a new dataset of vowel [a:] recordings from Swedish-speaking COPD patients and healthy controls once a week in self-determined quiet settings. Voice features, including baseline acoustic (BLA) parameters and Mel-Frequency Cepstral Coefficients (MFCCs), were extracted and used to train three ML classifiers: CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). CB demonstrated the highest test accuracy at 78%. 

Study III investigates the effects of signal segmentation on model performance and shows that certain temporal segments of voice recordings contain more informative patterns, enhancing classification outcomes by increasing accuracy to 85%. Study IV applies statistical and practical significance tests to compare voice features between COPD and healthy groups. A total of 34 features, including shimmer measures and higher-order MFCC derivatives, were found to meaningfully differentiate the groups. 
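
The idea of examining temporal segments of a recording separately can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the segmentation scheme, so the equal-length split into three parts, the RMS energy feature, the synthetic ramped sine signal, and the function names are all hypothetical choices.

```python
import math

def split_into_segments(samples, n_segments):
    """Split samples into n_segments contiguous chunks of near-equal length.

    Assumes 1 <= n_segments <= len(samples), so no chunk is empty.
    """
    n = len(samples)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [samples[bounds[i]:bounds[i + 1]] for i in range(n_segments)]

def rms(segment):
    """Root-mean-square energy of one segment."""
    return math.sqrt(sum(s * s for s in segment) / len(segment))

# Synthetic stand-in for a sustained vowel: a sine whose amplitude ramps up.
signal = [(i / 300) * math.sin(2 * math.pi * 5 * i / 100) for i in range(300)]

segments = split_into_segments(signal, 3)
energies = [rms(seg) for seg in segments]
print([len(seg) for seg in segments], energies)
```

Each segment can then be passed through the same feature extraction and classification steps as a full recording, which makes it possible to compare how informative the beginning, middle, and end of an utterance are.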

This thesis reframes the human voice as a source of clinically relevant data, demonstrating how it can be digitized, analyzed, and interpreted using ML to aid COPD assessment. The results indicate that voice-based analysis can provide an accessible, non-invasive, and scalable complement to existing diagnostic tools. By integrating technical, clinical, and ethical perspectives, the thesis contributes new knowledge and practical methodologies that align with AHT's goal of creating value-driven, user-centered healthcare solutions. The findings support future development of mobile and remote voice-based screening tools for COPD and other conditions.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2025. p. 160
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:07
Keywords
Chronic Obstructive Pulmonary Disease, Machine Learning, Noninvasive Diagnostic, Segmentation, Voice-based Analysis
National Category
Respiratory Medicine and Allergy; Medical and Health Sciences; Medical Informatics
Research subject
Applied Health Technology
Identifiers
urn:nbn:se:bth-28038 (URN); 978-91-7295-503-5 (ISBN)
Public defence
2025-10-15, J1630, Valhallavägen 1, Karlskrona, 13:00 (English)
Available from: 2025-08-11. Created: 2025-06-10. Last updated: 2025-10-28. Bibliographically approved.

Authority records

Idrisoglu, Alper; Dallora Moraes, Ana Luiza; Cheddad, Abbas; Anderberg, Peter; Sanmartin Berglund, Johan
