Document Type

Article

Publication Date

9-2021

Abstract

Handwritten numeral recognition is a challenging research problem because of the enormous variety of styles in which people write numerals. Several researchers have proposed solutions to this problem that achieve exceptional recognition accuracies. However, most of these solutions are dedicated to numerals of a single script. Such methods are inappropriate for multilingual nations such as India, where a large number of scripts are used. With this issue in mind, a new feature descriptor named symbolization of binary images (SBI) is introduced here for the recognition of handwritten numerals of different scripts. The effectiveness of SBI is supported by experiments showing its script-invariant nature. Classification of numerals using a multiclass support vector machine (SVM) classifier yields recognition accuracies of 98.18%, 96.22%, 96.52%, and 95.53% on datasets of numerals written in four popular scripts of the world: Arabic, Bangla, Devanagari, and Latin, respectively. The scheme has also been extended to the situation in which the script is not known a priori, or in which the numerals in a document belong to mixed script pairs of Arabic, Devanagari, or Bangla with Latin, producing recognition rates of 92.97%, 91.25%, and 91.67%, respectively. When all four scripts are mixed, the overall recognition rate is still 90.98%. These encouraging outcomes suggest that the proposed SBI feature descriptor can recognize numerals irrespective of the script class.

Copyright Statement

This is an author-produced, peer-reviewed version of this article. The final, definitive version of this document can be found online at Expert Systems, published by John Wiley & Sons Ltd. Copyright restrictions may apply. https://doi.org/10.1111/exsy.12699
