A Comparison of Advanced DeepLearning Algorithms for Multi-digit Detection in Historical Documents
2023 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Background: Historical handwritten documents are assets for future generations and should be appropriately secured, so handwritten digit detection plays an important role in preserving them. Handwritten digit detection is a fundamental problem that has been studied extensively for many years. While earlier methods were effective to some extent, they often required domain expertise and extensive parameter tuning, which made them time-consuming to apply and difficult to generalize to new data. The development of deep learning techniques has led to significant improvements in handwritten digit recognition. Deep learning models can automatically learn relevant features from raw image data, which makes them more robust to variations in handwriting style.
Objectives: This study first selects several deep learning algorithms for digit detection, taking into account the challenges posed by handwritten digits, and then uses performance metrics to identify the best-performing model among them.
Methods: A literature review and experimentation are the research methods employed in this study. We chose four advanced deep learning methods (YOLOv5, Faster R-CNN, RetinaNet, and YOLOv7) to identify handwritten digits in digit string images. Each method is trained and tested on the DIDA dataset. The best method is then determined through a performance evaluation based on the experimental results for all selected deep learning methods.
Results: The augmented and digit string datasets are used to train and test the deep learning models. The chosen models are evaluated using performance metrics to identify an efficient model. The experimental evaluation identifies the best of the selected deep learning models for detecting multi-digit strings in historical handwritten digit images.
Conclusions: The performance metrics obtained for each algorithm indicate that YOLOv7 is more efficient and accurate than YOLOv5, RetinaNet, and Faster R-CNN at detecting handwritten digits in historical document images.
Place, publisher, year, edition, pages
2023.
Keywords [en]
Deep Learning Methods, Handwritten digits, Digit Detection, Image processing on document images, YOLOV5, Faster R-CNN, RetinaNet, YOLOV7
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-24323
OAI: oai:DiVA.org:bth-24323
DiVA, id: diva2:1740872
Subject / course
DV2572 Master's Thesis in Computer Science
Educational program
DVADA Master Qualification Plan in Computer Science
Presentation
2023-01-23, Karlskrona, 13:00 (English)
Supervisors
Examiners
2023-03-02 2023-03-02 2023-03-02 Bibliographically approved