Document Type
Article
Publication Date
12-2022
Abstract
This paper demonstrates a framework for offline handwriting recognition using character spotting and autonomous tagging that works for any alphabetic script. Character spotting builds on the idea of object detection to find character elements in unsegmented word images. An autonomous tagging approach is introduced that automates the production of a character image training set by estimating character locations in a word based on typical character size. Although scripts can vary widely from each other, our proposed approach provides a simple and powerful workflow for unconstrained offline recognition that should work for any alphabetic script with few adjustments. Here we demonstrate this approach with handwritten Bangla, obtaining a character recognition accuracy (CRA) of 94.8% and 91.12% with precision tagging and autonomous tagging, respectively. Furthermore, we explain how character spotting and autonomous tagging can be implemented for other alphabetic scripts. We demonstrate this with handwritten Hangul/Korean, obtaining a Jamo recognition accuracy (JRA) of 93.16% using a tiny fraction of the PE92 training set. The combination of character spotting and autonomous tagging removes one of the biggest frustrations, data annotation by hand, and thus we believe it has the potential to revolutionize the development of offline recognition.
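As a rough illustration of the autonomous tagging idea described in the abstract, the sketch below estimates character locations in a word image by dividing the image width in proportion to each character's typical width. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `estimate_char_boxes`, the `typical_width` table, the `default_width` fallback, and the proportional-split rule are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# (character, x_min, y_min, x_max, y_max) in pixel coordinates
Box = Tuple[str, int, int, int, int]


def estimate_char_boxes(
    word: str,
    image_width: int,
    image_height: int,
    typical_width: Dict[str, float],
    default_width: float = 1.0,
) -> List[Box]:
    """Estimate one bounding box per character of `word`.

    Assumption (hypothetical, for illustration only): characters are laid
    out left to right and each one occupies a share of the image width
    proportional to its typical relative width.
    """
    if not word:
        return []

    # Relative width of each character; unknown characters use the fallback.
    weights = [typical_width.get(ch, default_width) for ch in word]
    total = sum(weights)

    boxes: List[Box] = []
    cumulative = 0.0
    for ch, weight in zip(word, weights):
        x_min = round(image_width * cumulative / total)
        cumulative += weight
        x_max = round(image_width * cumulative / total)
        # Each box spans the full image height in this simplified sketch.
        boxes.append((ch, x_min, 0, x_max, image_height))
    return boxes


if __name__ == "__main__":
    # Hypothetical relative widths: "b" is assumed twice as wide as "a".
    for box in estimate_char_boxes("ab", 90, 30, {"a": 1.0, "b": 2.0}):
        print(box)
```

Boxes estimated this way could serve as approximate labels for training an object-detection-style character spotter without hand annotation. A real script would likely also need handling for modifiers and vertically stacked elements, which this sketch ignores.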
Copyright Statement
This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s10032-022-00410-x
Publication Information
Majid, Nishatul and Barney Smith, Elisa H. (2022). "Character Spotting and Autonomous Tagging: Offline Handwriting Recognition for Bangla, Korean and Other Alphabetic Scripts". International Journal on Document Analysis and Recognition, 25(4), 245-263. https://doi.org/10.1007/s10032-022-00410-x
Comments
Related Dataset
Majid, Nishatul and Barney Smith, Elisa H. (2018). "Boise State Bangla Handwriting Dataset". [Data set]. Signal and Image Processing Lab, 1. https://scholarworks.boisestate.edu/saipl/1