Efficient document image binarization using heterogeneous computing and parameter tuning
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik. ORCID iD: 0000-0002-2161-7371
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik. ORCID iD: 0000-0001-9947-1088
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik. ORCID iD: 0000-0002-0535-1761
2018 (English). In: International Journal on Document Analysis and Recognition, ISSN 1433-2833, E-ISSN 1433-2825, Vol. 21, no. 1-2, p. 41-58. Article in journal (Refereed). Published
Abstract [en]

In the context of historical document analysis, image binarization is an important first step, which separates foreground from background despite common image degradations such as faded ink, stains, or bleed-through. Fast binarization is highly significant when analyzing vast archives of document images, since even small inefficiencies can quickly accumulate to years of wasted execution time. Efficient binarization is therefore especially relevant to companies and government institutions that want to analyze their large collections of document images. The main challenge is to improve execution performance without affecting binarization performance. We modify a state-of-the-art binarization algorithm and achieve, on average, 3.5 times faster execution by correctly mapping this algorithm to a heterogeneous platform consisting of a CPU and a GPU. Our proposed parameter tuning algorithm additionally reduces the execution time for parameter tuning by a factor of 1.7 compared to previous parameter tuning algorithms. For the chosen algorithm, machine learning-based parameter tuning improves execution performance more than heterogeneous computing when comparing absolute execution times. © 2018 The Author(s)
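
To illustrate what "separating foreground from background" means in practice, the following is a minimal sketch of global thresholding with Otsu's method. It is not the state-of-the-art algorithm modified in the article, which the abstract does not name; the function names `otsu_threshold` and `binarize` and the tiny sample `page` are hypothetical and chosen only for illustration. It assumes 8-bit greyscale input in which ink is darker than paper.

```python
from typing import List

def otsu_threshold(pixels: List[int]) -> int:
    """Return the grey level t (0-255) that maximises the between-class
    variance when pixels are split into values <= t and values > t."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_low = 0    # number of pixels with value <= t
    sum_low = 0  # sum of the values of those pixels
    for t in range(256):
        n_low += hist[t]
        sum_low += t * hist[t]
        if n_low == 0:
            continue  # no pixels in the dark class yet
        n_high = total - n_low
        if n_high == 0:
            break  # every pixel is already in the dark class
        mean_low = sum_low / n_low
        mean_high = (sum_all - sum_low) / n_high
        var_between = n_low * n_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map each pixel to 1 (foreground, assumed dark ink) or 0 (background)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]

# Hypothetical 3x3 "page": three dark ink pixels on bright paper.
page = [
    [200, 210, 30],
    [205, 25, 215],
    [20, 220, 198],
]
mask = binarize(page)
```

A single global threshold like this tends to fail on the degradations the abstract lists (faded ink, stains, bleed-through), which is one reason more elaborate, parameterized algorithms are used for historical documents.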

Place, publisher, year, edition, pages
Springer Verlag, 2018. Vol. 21, no. 1-2, p. 41-58
Keywords [en]
Automatic parameter tuning, Heterogeneous computing, Historical documents, Image binarization, Bins, History, Image analysis, Learning systems, Document image binarization, Government institutions, Heterogeneous platforms, Parameter tuning algorithm, Parameter estimation
Identifiers
URN: urn:nbn:se:bth-15891
DOI: 10.1007/s10032-017-0293-7
ISI: 000433193500003
Scopus ID: 2-s2.0-85041228615
OAI: oai:DiVA.org:bth-15891
DiVA, id: diva2:1182896
Available from: 2018-02-15. Created: 2018-02-15. Last updated: 2018-08-27. Bibliographically approved
In thesis
1. Efficient Document Image Binarization using Heterogeneous Computing and Interactive Machine Learning
2018 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Large collections of historical document images have been collected by companies and government institutions for decades. More recently, these collections have been made available to a larger public via the Internet. However, to make accessing them truly useful, the contained images need to be made readable and searchable. One step in that direction is document image binarization, the separation of text foreground from page background. This separation makes the text shown in the document images easier to process for humans and for other image processing algorithms alike. While binarization algorithms that work reasonably well exist, it is not sufficient merely to separate foreground and background well. This separation also has to be achieved efficiently, both in terms of execution time and in terms of the training data used by machine learning-based methods. This is necessary to make binarization not only theoretically possible, but also practically viable.

In this thesis, we explore different ways to achieve efficient binarization in terms of execution time by improving the implementation and the algorithm of a state-of-the-art binarization method. We find that parameter prediction, as well as mapping the algorithm onto the graphics processing unit (GPU), helps to improve its execution performance. Furthermore, we propose a binarization algorithm based on recurrent neural networks and evaluate the choice of its design parameters with respect to their impact on execution time and binarization quality. Here, we identify a trade-off between binarization quality and execution performance based on the algorithm’s footprint size, and show that dynamically weighted training loss tends to improve the binarization quality. Lastly, we address the problem of training data efficiency by evaluating the use of interactive machine learning to reduce the amount of training data required by our recurrent neural network-based method. We show that user feedback can help to achieve better binarization quality with less training data and that visualized uncertainty helps to guide users to give more relevant feedback.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2018. p. 135
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 3
Keywords
image binarization, heterogeneous computing, recurrent neural networks, interactive machine learning, historical documents
Identifiers
URN: urn:nbn:se:bth-16797
ISBN: 978-91-7295-355-0
Presentation
2018-09-10, J1640, Valhallavägen 1, Karlskrona, 10:15 (English)
Projects
Scalable resource-efficient systems for big data analytics
Funder
Knowledge Foundation, 20140032
Available from: 2018-08-27. Created: 2018-07-12. Last updated: 2018-11-14. Bibliographically approved

Open Access in DiVA

fulltext (1264 kB)
File information
File name: FULLTEXT01.pdf
File size: 1264 kB
Checksum (SHA-512): 481d903fc9c988715e1cc1a3206ab64c8c6daccfd158c4a71ab8ff5df016e59e80910d6f16b8e7af5b0dc5e93866a852a46b9e803dbbb400a5f58f97dc208a15
Type: fulltext
Mimetype: application/pdf

Authors

Westphal, Florian; Grahn, Håkan; Lavesson, Niklas
