This paper presents a publicly accessible Bangla offline handwriting dataset, together with a benchmark obtained from a simple and robust recognition scheme for isolated handwritten characters. The dataset is named the XXX Bangla Handwriting Dataset. Each writer's contribution consists of two pages. The first page holds a 104-word (364-character) essay that uses 49 basic characters, all 11 vowel diacritics and 32 high-frequency consonant conjuncts. The second page holds 84 isolated units covering all basic characters, numerals, vowel diacritics and several high-frequency conjuncts. The initial release is based on voluntary contributions from 100 different writers. A distinguishing feature of this database is that all of its contents are tagged with ground truth information at several levels of the component hierarchy: characters, words and lines. The dataset is expected to be useful for research on offline Bangla handwriting recognition, particularly for segmentation-based approaches. Furthermore, a basic character recognition method is presented in which features are extracted from zonal pixel counts, structural strokes and grid points with U-SURF descriptors modeled with a bag of features. An SVM classifier with a cubic kernel achieves the highest classification accuracy, 95.4%, on isolated characters drawn from the XXX dataset together with three other datasets, which were included to test the versatility and robustness of the method.
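As a rough illustration of the first feature type named above, the following minimal sketch computes zonal pixel counts for a binarized character image. The abstract does not specify the zone grid, the normalization, or the image size, so the function name `zonal_pixel_counts`, the default 4x4 grid, and the division by zone area are all assumptions made for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence


def zonal_pixel_counts(
    image: Sequence[Sequence[int]], rows: int = 4, cols: int = 4
) -> List[float]:
    """Return the fraction of foreground pixels in each zone of a binary image.

    `image` is a 2D grid of 0 (background) and 1 (ink). The grid size and the
    normalization by zone area are illustrative assumptions, not values taken
    from the paper.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height == 0 or width == 0:
        raise ValueError("image must be non-empty")

    features: List[float] = []
    for r in range(rows):
        # Integer boundaries so that the zones tile the image without gaps.
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            area = (bottom - top) * (right - left)
            ink = sum(
                image[y][x]
                for y in range(top, bottom)
                for x in range(left, right)
            )
            # A zone can be empty when the image is smaller than the grid.
            features.append(ink / area if area else 0.0)
    return features


# Hypothetical 4x4 "character" split into a 2x2 grid of zones.
demo = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(zonal_pixel_counts(demo, rows=2, cols=2))  # [0.75, 0.0, 0.0, 1.0]
```

A vector of this kind would be only one part of the feature set described; the structural stroke features and the U-SURF bag of features are not sketched here. The combined vector could then be passed to a classifier such as an SVM with a degree-3 polynomial kernel.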
© 2018, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. doi: 10.1109/ICFHR-2018.2018.00073
Majid, Nishatul and Barney Smith, Elisa H. (2018). "Introducing the XXX Bangla Handwriting Dataset and an Efficient Offline Recognizer of Isolated Bangla Characters". Proceedings: 2018 16th International Conference on Frontiers in Handwriting Recognition: ICFHR 2018, 380-385.