Document Type

Conference Proceeding

Publication Date

1-16-2006

DOI

http://dx.doi.org/10.1117/12.641229

Abstract

Generally speaking, optical character recognition (OCR) algorithms tend to perform better when presented with homogeneous data. This paper studies a method designed to increase the homogeneity of training data, based on an understanding of the types of degradations that occur during the printing and scanning process and of how these degradations affect the homogeneity of the data. While it has been shown that dividing the degradation space by edge spread improves recognition accuracy over dividing it by threshold or point spread function width alone, the challenge lies in deciding how many partitions to use and at what values of edge spread the divisions should be made. Clustering of different types of character features, fonts, sizes, resolutions, and noise levels shows that edge spread is indeed a strong indicator of the homogeneity of character data clusters.

Copyright Statement

Copyright 2006 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. DOI: 10.1117/12.641229

Publication Information

Barney Smith, Elisa H. and Andersen, Tim. (2006). "Partitioning of the Degradation Space for OCR Training". Proceedings of SPIE-IS&T Electronic Imaging, 6067.

Included in

Electrical and Computer Engineering Commons

Partitioning of the Degradation Space for OCR Training
