A hybrid CNN-transformer framework optimized by Grey Wolf Algorithm for accurate sign language recognition
SIMAD University, Somalia.
University Teknologi Malaysia, Malaysia.
Torrens University Australia, Australia.
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-4071-4596
2025 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 15, no 1, article id 43550. Article in journal (Refereed), Published
Abstract [en]

This paper introduces the Grey Wolf Optimized Convolutional Transformer Network (GWO-CTransNet), a combined deep learning framework aimed at accurately and efficiently recognizing dynamic hand gestures, especially in American Sign Language (ASL). The model integrates Convolutional Neural Networks (CNNs) for spatial feature extraction, Transformers for temporal sequence modeling, and Grey Wolf Optimization (GWO) for hyperparameter tuning. Extensive experiments were conducted on two benchmark datasets, ASL Alphabet and ASL MNIST, to validate the model’s effectiveness in both static and dynamic sign classification. The proposed model achieved superior performance across all key metrics, including an accuracy of 99.40%, F1-score of 99.31%, Matthews Correlation Coefficient (MCC) of 0.988, and Area Under the Curve (AUC) of 0.992, surpassing existing models such as PCA-IGWO, KPCA-IGWO, GWO-CNN, and AEGWO-NET. Real-time gesture detection outputs further demonstrated the model’s robustness in varied environmental conditions and its applicability in assistive communication technologies. Additionally, the integration of GWO not only accelerated convergence but also enhanced generalization by optimally selecting model configurations. The results show that GWO-CTransNet offers a powerful, scalable solution for vision-based sign language recognition systems, combining high accuracy, fast inference, and adaptability in real-world applications.
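
The abstract names Grey Wolf Optimization as the hyperparameter tuner but does not give its details here. As an illustration only, the following is a minimal sketch of the textbook GWO update (three leader wolves, a coefficient `a` decaying linearly from 2 to 0) applied to a toy objective. The function names, the two hyperparameters (log10 learning rate and dropout), their bounds, and the quadratic stand-in for a validation loss are all assumptions made for this sketch, not the authors' configuration or code.

```python
import random


def toy_validation_loss(params):
    # Hypothetical stand-in for a real validation loss; minimum at
    # log10(learning rate) = -3 and dropout = 0.3 (assumed values).
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimize `objective` over box `bounds` with the standard GWO update."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for t in range(n_iters):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        score = objective(leaders[0])
        if score < best_score:
            best_pos, best_score = list(leaders[0]), score

        a = 2.0 * (1.0 - t / n_iters)  # exploration coefficient, 2 -> 0
        new_wolves = []
        for wolf in wolves:
            new_pos = []
            for d, (lo, hi) in enumerate(bounds):
                estimates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - wolf[d])
                    estimates.append(leader[d] - A * dist)
                # Average the three leader-guided estimates, then clip to bounds.
                x = sum(estimates) / 3.0
                new_pos.append(min(max(x, lo), hi))
            new_wolves.append(new_pos)
        wolves = new_wolves

    # Account for the final population as well.
    final = min(wolves, key=objective)
    if objective(final) < best_score:
        best_pos, best_score = list(final), objective(final)
    return best_pos, best_score


if __name__ == "__main__":
    search_space = [(-5.0, -1.0), (0.0, 0.6)]  # assumed bounds
    pos, score = grey_wolf_optimize(toy_validation_loss, search_space)
    print("best log10(lr), dropout:", pos, "loss:", score)
```

In an actual tuning run the objective would train and validate a model for each candidate configuration, which is far more expensive than this toy function, so the pack size and iteration count would need to be chosen with that cost in mind.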

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 15, no 1, article id 43550
Keywords [en]
Convolutional neural network, Grey Wolf Optimization, Hand gesture recognition, Hyperparameter optimization, Sign language recognition, algorithm, artificial neural network, deep learning, gesture, human, sign language, Algorithms, Gestures, Humans, Neural Networks, Computer
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-29031
DOI: 10.1038/s41598-025-27390-2
Scopus ID: 2-s2.0-105024363301
OAI: oai:DiVA.org:bth-29031
DiVA, id: diva2:2024972
Available from: 2026-01-02. Created: 2026-01-02. Last updated: 2026-01-02. Bibliographically approved

Open Access in DiVA

fulltext (3740 kB), 47 downloads
File information
File name: FULLTEXT01.pdf
File size: 3740 kB
Checksum (SHA-512): 53ed672614bd49cdec03a6e2a02d547d62d70ce6ae8cec1776200f978e7ce896668a0ad160341743329a70f6d08af293628947a670dddb7a77e469ea73769b5e
Type: fulltext
Mimetype: application/pdf


Authority records

Kebande, Victor R.
