User Feedback and Uncertainty in User Guided Binarization
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0002-2161-7371
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0001-9947-1088
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0002-0535-1761
2018 (English)
In: International Conference on Data Mining Workshops / [ed] Tong, H; Li, Z; Zhu, F; Yu, J, IEEE Computer Society, 2018, p. 403-410, article id 8637367
Conference paper, Published paper (Refereed)
Abstract [en]

In a child’s development, the child’s inherent ability to construct knowledge from new information is as important as explicit instructional guidance. Similarly, mechanisms to produce suitable learning representations, which can be transferred and allow integration of new information, are important for artificial learning systems. However, equally important are modes of instructional guidance, which allow the system to learn efficiently. Thus, the challenge for efficient learning is to identify suitable guidance strategies together with suitable learning mechanisms.

In this paper, we propose guided machine learning as a source of suitable guidance strategies. We distinguish between sample selection based and privileged information based strategies and evaluate three sample selection based strategies on a simple transfer learning task. The evaluated strategies are random sample selection, i.e., supervised learning, user based sample selection based on readability, and user based sample selection based on readability and uncertainty. We show that sampling based on readability and uncertainty tends to produce better learning results than the other two strategies. Furthermore, we evaluate the use of the learner’s uncertainty for self directed learning and find that effects similar to the Dunning-Kruger effect prevent this use case. The learning task in this study is document image binarization, i.e., the separation of text foreground from page background. The source domain of the transfer is texts written on paper in Latin characters, while the target domain is texts written on palm leaves in Balinese script.

Place, publisher, year, edition, pages
IEEE Computer Society, 2018. p. 403-410, article id 8637367
Keywords [en]
guided machine learning, interactive machine learning, image binarization, historical documents
National Category
Computer Vision and Robotics (Autonomous Systems); Human Computer Interaction
Identifiers
URN: urn:nbn:se:bth-17742
DOI: 10.1109/ICDMW.2018.00066
ISI: 000465766800058
ISBN: 978-1-5386-9288-2 (electronic)
OAI: oai:DiVA.org:bth-17742
DiVA, id: diva2:1299611
Conference
18th IEEE International Conference on Data Mining Workshops, ICDMW, Singapore, Singapore, 17 November 2018 through 20 November 2018
Funder
Knowledge Foundation, 20140032
Note

"© 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works."

Available from: 2019-03-27 Created: 2019-03-27 Last updated: 2020-05-29
Bibliographically approved
In thesis
1. Data and Time Efficient Historical Document Analysis
2020 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the last decades, companies and government institutions have gathered vast collections of images of historical handwritten documents. In order to make these collections truly useful to the broader public, images suffering from degradations, such as faded ink, bleed-through or stains, need to be made readable and the collections as a whole need to be made searchable. Readability can be achieved by separating text foreground from page background using document image binarization, while searchability by search string or by example image can be achieved through word spotting. Developing algorithms with reasonable binarization or word spotting performance is a difficult task. Additional challenges are to make these algorithms execute fast enough to process vast collections of images in a reasonable amount of time, and to enable them to learn from few labeled training samples. In this thesis, we explore heterogeneous computing, parameter prediction, and enhanced throughput as ways to reduce the execution time of document image binarization algorithms. We find that parameter prediction and mapping a heuristics based binarization algorithm to the GPU lead to a 1.7 and a 3.5 increase in execution performance, respectively. Furthermore, for a learning based binarization algorithm using recurrent neural networks, we identify the number of pixels processed at once as a way to trade off execution time against binarization quality. The achieved increase in throughput results in a 3.8 times faster overall execution time. Additionally, we explore guided machine learning (gML) as a possible approach to reduce the required amount of training data for learning based algorithms for binarization, character recognition and word spotting. We propose an initial gML system for binarization, which allows a user to improve an algorithm’s binarization quality by selecting suitable training samples. Based on this system, we identify and pursue three different directions, viz., formulation of a clear definition of gML, identification of an efficient knowledge transfer mechanism from user to learner, and automation of sample selection. We explore the Learning Using Privileged Information paradigm as a possible knowledge transfer mechanism by using character graphs as privileged information for training a neural network based character recognizer. Furthermore, we show that, given a suitable word image representation, automatic sample selection can help to reduce the amount of training data required for word spotting by up to 69%.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2020. p. 202
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 5
National Category
Computer Engineering; Computer Sciences; Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:bth-19529
ISBN: 978-91-7295-404-5
Public defence
2020-09-03, J1630, Valhallavägen 1, Karlskrona, 13:15 (English)
Funder
Knowledge Foundation, 20140032
Available from: 2020-05-29 Created: 2020-05-29 Last updated: 2020-05-29
Bibliographically approved

Open Access in DiVA

fulltext (599 kB), 76 downloads
File information
File name: FULLTEXT01.pdf
File size: 599 kB
Checksum (SHA-512): 280ec5d97b7cb221b375fb2087b4127038d6365a6947b8acc3a58d175c7f34c7104040968c60c0dd3e3eb8438e761f763a34a0bce8b86ac00b794adef5c38fd2
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text

Authority records BETA

Westphal, Florian; Grahn, Håkan; Lavesson, Niklas
