Voice for Decision Support in Healthcare Applied to Chronic Obstructive Pulmonary Disease Classification: A Machine Learning Approach
Blekinge Tekniska Högskola, Fakulteten för teknikvetenskaper, Institutionen för hälsa. Blekinge Institute of Technology. ORCID iD: 0000-0003-1558-2309
2024 (English). Licentiate thesis, comprising papers (Other academic)
Abstract [en]

Background: Advancements in machine learning (ML) techniques and voice technology offer the potential to harness voice as a basis for decision-support tools in healthcare, benefiting both healthcare providers and patients. Motivated by technological breakthroughs and the increasing integration of Artificial Intelligence (AI) and ML in healthcare, numerous studies investigate the diagnostic potential of ML algorithms in the context of voice-affecting disorders. This thesis focuses on respiratory diseases such as Chronic Obstructive Pulmonary Disease (COPD) and explores the potential of a decision support tool that utilizes voice and ML. This exploration exemplifies the intricate relationship between voice and overall health through the lens of applied health technology (AHT). The interdisciplinary nature of this research recognizes the need for accurate and efficient diagnostic tools.

Objective: The objectives of this licentiate thesis are twofold. Firstly, a Systematic Literature Review (SLR) thoroughly investigates the current state of ML algorithms in detecting voice-affecting disorders, pinpointing existing gaps and suggesting directions for future research. Secondly, the study focuses on respiratory health, specifically COPD, employing ML techniques with a distinct emphasis on the vowel "A". The aim is to explore hidden information that could potentially be utilized for the binary classification of COPD vs no COPD. The creation of a new Swedish COPD voice classification dataset is anticipated to enhance the experimental and exploratory dimensions of the research.

Methods: Scanning and analyzing the literature is a common way to obtain a holistic view of a research field. Paper I therefore followed the methodology of an SLR, in which existing journal publications were scanned and synthesized to give a holistic view of the ML techniques applied to voice-affecting disorders. Based on the results of the SLR, Paper II focused on data collection and experimentation for the binary classification of COPD, which was one of the gaps identified in the first study. Three distinct ML algorithms were investigated using voice features extracted from the collected dataset, which consisted of recordings collected through a mobile application from participants aged 18 years and above, and the most commonly used performance measures were computed to identify the best outcome.

Results: The summary of findings from Paper I reveals the dominance of Support Vector Machine (SVM) classifiers in voice disorder research, with Parkinson's Disease and Alzheimer's Disease as the most studied disorders. Gaps in research include underrepresented disorders, limited datasets in terms of number of participants, and a lack of interest in longitudinal studies. Paper II demonstrates promising results in COPD classification using ML and a newly developed dataset, offering insights into potential decision support tools for COPD diagnosis.

Conclusion: The studies covered in this dissertation provide a comprehensive literature summary of ML techniques used to support decision-making on voice-affecting disorders for clinical outcomes. The findings contribute to understanding the diagnostic potential of using ML on vocal features and highlight avenues for future research and technology development. In addition, the experiment reveals the potential of employing voice as a digital biomarker for COPD diagnosis using ML.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024, p. 103
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 2024:03
Keywords [en]
Automated decision-support, Classification, Machine Learning, Voice-affecting disorders, Voice dataset, Voice Features, Chronic Obstructive Pulmonary Disease (COPD)
National Category
Research programme
Applied Health Technology
Identifiers
URN: urn:nbn:se:bth-25978; ISBN: 978-91-7295-476-2 (print); OAI: oai:DiVA.org:bth-25978; DiVA, id: diva2:1838825
Presentation
2024-05-23, J1630, Valhallavägen 1, Karlskrona, 10:00 (English)
Opponent
Supervisor
Available from: 2024-02-19. Created: 2024-02-19. Last updated: 2025-10-28. Bibliographically approved.
List of papers
1. Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review
2023 (English). In: Journal of Medical Internet Research, E-ISSN 1438-8871, Vol. 25, article id e46105. Article, review/survey (Refereed). Published
Abstract [en]

Background: Normal voice production depends on the synchronized cooperation of multiple physiological systems, which makes the voice sensitive to changes. Any systematic, neurological, and aerodigestive distortion is prone to affect voice production through reduced cognitive, pulmonary, and muscular functionality. This sensitivity inspired using voice as a biomarker to examine disorders that affect the voice. Technological improvements and emerging machine learning (ML) technologies have enabled possibilities of extracting digital vocal features from the voice for automated diagnosis and monitoring systems.

Objective: This study aims to summarize a comprehensive view of research on voice-affecting disorders that uses ML techniques for diagnosis and monitoring through voice samples where systematic conditions, nonlaryngeal aerodigestive disorders, and neurological disorders are specifically of interest.

Methods: This systematic literature review (SLR) investigated the state of the art of voice-based diagnostic and monitoring systems with ML technologies, targeting voice-affecting disorders without direct relation to the voice box from the point of view of applied health technology. Through a comprehensive search string, studies published from 2012 to 2022 from the databases Scopus, PubMed, and Web of Science were scanned and collected for assessment. To minimize bias, retrieval of the relevant references in other studies in the field was ensured, and 2 authors assessed the collected studies. Low-quality studies were removed through a quality assessment and relevant data were extracted through summary tables for analysis. The articles were checked for similarities between author groups to prevent cumulative redundancy bias during the screening process, where only 1 article was included from the same author group.

Results: In the analysis of the 145 included studies, support vector machines were the most utilized ML technique (51/145, 35.2%), with the most studied disease being Parkinson disease (PD; reported in 87/145, 60%, studies). After 2017, 16 additional voice-affecting disorders were examined, in contrast to the 3 investigated previously. Furthermore, an upsurge in the use of artificial neural network-based architectures was observed after 2017. Almost half of the included studies were published in the last 2 years (2021 and 2022). A broad interest from many countries was observed. Notably, nearly one-half (n=75) of the studies relied on 10 distinct data sets, and 11/145 (7.6%) used demographic data as an input for ML models.

Conclusions: This SLR revealed considerable interest across multiple countries in using ML techniques for diagnosing and monitoring voice-affecting disorders, with PD being the most studied disorder. However, the review identified several gaps, including limited and unbalanced data set usage in studies, and a focus on diagnostic tests rather than disorder-specific monitoring. Despite the limitation of being constrained to peer-reviewed publications written in English, the SLR provides valuable insights into the current state of research on ML-based voice-affecting disorder diagnosis and monitoring and highlights areas to address in future research. © 2023 Journal of Medical Internet Research. All rights reserved.

Place, publisher, year, edition, pages
JMIR Publications, 2023
Keywords
diagnosis, digital biomarkers, machine learning, monitoring, voice features, voice-affecting disorder, Humans, Monitoring, Physiologic, human, physiologic monitoring
National Category
Identifiers
urn:nbn:se:bth-25253 (URN); 10.2196/46105 (DOI); 001048954300007 (); 2-s2.0-85165520794 (Scopus ID)
Available from: 2023-08-09. Created: 2023-08-09. Last updated: 2025-10-28. Bibliographically approved.
2. COPDVD: Automated classification of chronic obstructive pulmonary disease on a new collected and evaluated voice dataset
2024 (English). In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 156, article id 102953. Article in journal (Refereed). Published
Abstract [en]

Background

Chronic obstructive pulmonary disease (COPD) is a severe condition affecting millions worldwide, leading to numerous annual deaths. The absence of significant symptoms in its early stages leads to high underdiagnosis rates among affected people. Besides pulmonary function failure, another harmful aspect of COPD is its systematic effects, e.g., heart failure or voice distortion. However, these systematic effects might provide valuable information for early detection; in other words, the symptoms they cause could help to detect the condition in its early stages.

Objective

The proposed study aims to explore whether the voice features extracted from the vowel “a” utterance carry any information that can be predictive of COPD by employing Machine Learning (ML) on a newly collected voice dataset.

Methods

Forty-eight participants were recruited from the pool of research clinic visitors at Blekinge Institute of Technology (BTH) in Sweden between January 2022 and May 2023. A dataset consisting of 1246 recordings from 48 participants was gathered. After an information and consent meeting with each participant, voice recordings of the vowel “a” utterance were collected using the VoiceDiagnostic application. The collected voice data was subjected to silence-segment removal and to extraction of baseline acoustic features and Mel-Frequency Cepstral Coefficients (MFCC). Sociodemographic data was also collected from the participants. Three ML models were investigated for the binary classification of COPD and healthy controls: Random Forest (RF), Support Vector Machine (SVM), and CatBoost (CB). A nested k-fold cross-validation approach was employed, and the hyperparameters of each ML model were optimized using grid search. To assess performance, accuracy, F1-score, precision, and recall were computed. Afterward, we further examined the best classifier by utilizing the Area Under the Curve (AUC), Average Precision (AP), and SHapley Additive exPlanations (SHAP) feature-importance measures.
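
The nested k-fold cross-validation with grid search described above can be sketched as follows. This is a minimal, standard-library-only illustration and not the study's pipeline: the one-feature k-nearest-neighbour classifier, the synthetic data, the fold counts, and the grid values are all assumptions chosen for brevity, whereas the study used RF, SVM, and CatBoost on acoustic and MFCC features.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)  # k is odd, so there are no ties


def score(train, test, k):
    """Accuracy of k-NN 'fitted' on train and evaluated on test."""
    return mean([int(knn_predict(train, x, k) == label) for x, label in test])


def cv_accuracy(rows, k, folds):
    """Mean validation accuracy of k-NN over the given folds of rows."""
    accs = []
    for v, val_idx in enumerate(folds):
        fit = [rows[j] for g, fold in enumerate(folds) if g != v for j in fold]
        val = [rows[j] for j in val_idx]
        accs.append(score(fit, val, k))
    return mean(accs)


def nested_cv(data, grid=(1, 3, 5), outer_k=5, inner_k=3, seed=0):
    """Return one (best_k, test_accuracy) pair per outer fold."""
    results = []
    outer = k_fold_indices(len(data), outer_k, seed)
    for i, test_idx in enumerate(outer):
        test = [data[j] for j in test_idx]
        train = [data[j] for f, fold in enumerate(outer) if f != i for j in fold]
        inner = k_fold_indices(len(train), inner_k, seed + 1)
        # Grid search: pick the hyperparameter with the best inner-CV accuracy.
        best_k = max(grid, key=lambda k: cv_accuracy(train, k, inner))
        # Refit on the whole outer-training split, score once on the held-out fold.
        results.append((best_k, score(train, test, best_k)))
    return results


# Synthetic two-class data (hypothetical): one feature, label 0 or 1.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(30)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(30)]

results = nested_cv(data)
for fold, (best_k, acc) in enumerate(results):
    print(f"outer fold {fold}: best k={best_k}, test accuracy={acc:.2f}")
```

The inner loop only ever sees the outer-training split, so the hyperparameter choice cannot leak information from the held-out fold. Because the dataset contains many recordings per participant, a real pipeline would probably also need to keep each participant's recordings within a single fold; the abstract does not state how the folds were formed.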

Results

The classifiers RF, SVM, and CB achieved maximum accuracies of 77 %, 69 %, and 78 % on the test set and 93 %, 78 %, and 97 % on the validation set, respectively. The CB classifier outperformed RF and SVM. Further examination of CB, the best-performing classifier, yielded an AUC of 82 % and an AP of 76 %. In addition to age and gender, the mean values of baseline acoustic and MFCC features demonstrate high importance and deterministic characteristics for classification performance in both test and validation sets, though in varied order.
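
For readers unfamiliar with the metrics reported above, the following sketch shows one common way to compute accuracy, precision, recall, F1-score, AUC, and AP for a binary classifier. It uses only the Python standard library; the labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data from the study, and the function names are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from 0/1 labels and 0/1 predictions.

    Assumes at least one predicted positive and one true positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each positive.

    Samples are ranked by descending score; assumes no tied scores.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Illustrative values only (1 = COPD, 0 = healthy control).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
y_pred = [int(s >= 0.5) for s in scores]  # assumed decision threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(metrics, round(auc, 3), round(ap, 3))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC and AP summarize the ranking of scores across all thresholds, which is one reason both kinds of measure are often reported together.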

Conclusion

This study concludes that recordings of the vowel “a” utterance contain information that the CatBoost classifier can capture with high accuracy for the classification of COPD. Additionally, baseline acoustic and MFCC features, in conjunction with age and gender information, can be employed for classification purposes and benefit healthcare for decision support in COPD diagnosis.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Acoustic features, Signal Processing, Automated classification, Chronic obstructive pulmonary disease, Machine Learning
National Category
Research programme
Applied Health Technology; Applied Signal Processing
Identifiers
urn:nbn:se:bth-26835 (URN); 10.1016/j.artmed.2024.102953 (DOI); 001358362000001 (); 2-s2.0-85202537741 (Scopus ID)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2024-08-20. Created: 2024-08-20. Last updated: 2025-10-28. Bibliographically approved.

Open Access in DiVA

fulltext (5327 kB), 698 downloads
File information
File: FULLTEXT01.pdf. File size: 5327 kB. Checksum: SHA-512
8311463a79151ce42a48fe794d9b0fec0d6a48947902b262997fa1031b95e7f5a165e892c3be058bfc45c78a50426d39fd0739628c090b647cd7df3463d1ce52
Type: fulltext. Mimetype: application/pdf

Author

Idrisoglu, Alper
