ARDIS: A Swedish Historical Handwritten Digit Dataset
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. (Big Data) ORCID iD: 0000-0001-7536-3349
KTO Karatay University, TUR. (Department of Mechatronics Engineering)
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. (Big Data) ORCID iD: 0000-0002-4390-411X
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0001-9947-1088
2020 (English) In: Neural Computing & Applications, ISSN 0941-0643, E-ISSN 1433-3058, Vol. 32, no 21, p. 16505-16518, article id Special issue. Article in journal (Refereed), Published
Abstract [en]

This paper introduces a new image-based handwritten historical digit dataset named ARDIS (Arkiv Digital Sweden). The images in ARDIS dataset are extracted from 15,000 Swedish church records which were written by different priests with various handwriting styles in the nineteenth and twentieth centuries. The constructed dataset consists of three single digit datasets and one digit strings dataset. The digit strings dataset includes 10,000 samples in Red-Green-Blue (RGB) color space, whereas, the other datasets contain 7,600 single digit images in different color spaces. An extensive analysis of machine learning methods on several digit datasets is examined. Additionally, correlation between ARDIS and existing digit datasets Modified National Institute of Standards and Technology (MNIST) and United States Postal Service (USPS) is investigated. Experimental results show that machine learning algorithms, including deep learning methods, provide low recognition accuracy as they face difficulties when trained on existing datasets and tested on ARDIS dataset. Accordingly, Convolutional Neural Network (CNN) trained on MNIST and USPS and tested on ARDIS provide the highest accuracies 58.80% and 35.44%, respectively. Consequently, the results reveal that machine learning methods trained on existing datasets can have difficulties to recognize digits effectively on our dataset which proves that ARDIS dataset has unique characteristics. This dataset is publicly available for the research community to further advance handwritten digit recognition algorithms.

Place, publisher, year, edition, pages
Springer Nature Switzerland, 2020. Vol. 32, no 21, p. 16505-16518, article id Special issue
Keywords [en]
Handwritten digit recognition, ARDIS dataset, Machine learning methods, Benchmark
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:bth-17741
DOI: 10.1007/s00521-019-04163-3
ISI: 000577694800012
OAI: oai:DiVA.org:bth-17741
DiVA, id: diva2:1299608
Part of project
Bigdata@BTH - Scalable resource-efficient systems for big data analytics, Knowledge Foundation
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2019-03-27. Created: 2019-03-27. Last updated: 2022-05-04. Bibliographically approved.

Open Access in DiVA

fulltext (1189 kB), 865 downloads
File information
File name: FULLTEXT01.pdf
File size: 1189 kB
Checksum (SHA-512):
dc034f71460da06733d19e6b41d25a873189ba0f750cc6b9afec942fad8415008996ddeb12458c353b8f17e02487d544151b4f7d0c0677afb53bd28cccf3dae5
Type: fulltext
Mimetype: application/pdf
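
The SHA-512 checksum listed in the file information can be used to confirm that a downloaded copy of the full text is intact. The following is a minimal sketch, assuming the PDF has been saved locally; the function names and the local path are hypothetical and not part of the record.

```python
import hashlib

# SHA-512 checksum as listed in the file information of this record.
EXPECTED_SHA512 = "dc034f71460da06733d19e6b41d25a873189ba0f750cc6b9afec942fad8415008996ddeb12458c353b8f17e02487d544151b4f7d0c0677afb53bd28cccf3dae5"


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_SHA512):
    """Compare a local file's digest with the checksum from the record."""
    return sha512_of_file(path) == expected.lower()
```

For example, `matches_listed_checksum("FULLTEXT01.pdf")` (a hypothetical local path) returns `True` only if the file's digest equals the listed value.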

Other links

Publisher's full text: https://link.springer.com/article/10.1007/s00521-019-04163-3

Authority records

Kusetogullari, Hüseyin; Cheddad, Abbas; Grahn, Håkan
