Modeling Image Degradations for Improving OCR
Clean documents are relatively easy to recognize. However, when digitizing document collections, clean documents are rarely the ones encountered. Printing and scanning introduce image degradations that interfere with segmentation and recognition. This paper presents mathematical models of these degradation processes, from which the observed degradations can be described both quantitatively and qualitatively. The discussion covers sampling, edge spread, corner erosion, and edge noise, and describes the relationship between these degradations and common OCR errors. Considering the degradation model provides a theoretical foundation for improving the document recognition process.
Barney Smith, Elisa. (2008). "Modeling Image Degradations for Improving OCR". 16th European Signal Processing Conference (EUSIPCO 2008).