Publication Date

8-2017

Date of Final Oral Examination (Defense)

5-8-2017

Type of Culminating Activity

Thesis

Degree Title

Master of Science in Electrical Engineering

Department

Electrical and Computer Engineering

Major Advisor

Elisa H. Barney Smith, Ph.D.

Advisor

Steven Olsen-Smith, Ph.D.

Advisor

Said Ahmed-Zaid, Ph.D.

Abstract

We examine the feasibility of using image processing techniques to differentiate the authorship of historical pencil marks. Pencil marks with attributed and unattributed authorship are segmented from digital images of historical books. Five features are extracted from the "vertical" pencil marks and analyzed as a basis for attributing authorship of the marks. These marks are single-stroke marks interspersed throughout the same document. We describe the challenges of the digital format we were given and the steps taken in using autonomous segmentation to save the pixel locations of marks. The five features chosen and extracted are Average Intensity, Stroke Width, Blurriness, Stroke Curvature, and Stroke Angle. The features are then analyzed using histograms, 2D scatter plots of feature space, and comparisons between the two groups of marks. C-means clustering is performed on the feature spaces of both groups. Semi-supervised clustering is used to test whether we can predict the clustering. We then use two forms of cluster validity, the Davies-Bouldin Index and Silhouette, to produce a confidence value on the number of clusters and their membership. Finally, we look at the histograms and 2D scatter plots with the Melville’s Marginalia Online attributed and unattributed labels applied. The extracted features show patterns and trends within the marks that could be used to group them. Specifically, Stroke Curvature emerged as a dominant feature that showed promise for differentiating marks created by different authors. Feature extraction has the potential to separate marks by author with high confidence.
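As an illustration of the cluster-validity step named in the abstract, the following is a minimal sketch of the Davies-Bouldin Index in plain Python. It is not the thesis's implementation: the function names, the two-dimensional feature vectors, and the sample values are all hypothetical and chosen only to show that a lower index corresponds to tighter, better-separated clusters.

```python
import math

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (at least two).

    Each cluster is a list of feature vectors. Lower values indicate
    compact clusters that lie far apart from one another.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its own cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k

# Hypothetical two-feature vectors (made-up values, not thesis data).
good = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)],
]
# Same points, deliberately assigned so each cluster mixes both groups.
bad = [
    [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)],
    [(1.0, 0.0), (1.0, 1.0), (11.0, 10.0), (11.0, 11.0)],
]

print(davies_bouldin(good))  # small value: compact, well-separated clusters
print(davies_bouldin(bad))   # much larger value: mixed, overlapping clusters
```

Under this sketch, comparing the index across candidate partitions (for example, different numbers of clusters) gives one way to rank them, with the lowest value preferred.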

DOI

https://doi.org/10.18122/B2QQ6N
