Publication Date


Date of Final Oral Examination (Defense)


Type of Culminating Activity


Degree Title

Master of Science in Computer Science


Computer Science

Major Advisor

Tim Andersen, PhD


Edoardo Serra, PhD


William L. Hughes, PhD


Reza Zadegan, PhD


Our increasingly information-driven world is increasing the demand for new storage technologies. Current estimates project that total storage demand will exceed the supply of usable silicon by 2040 [1]. DNA is an attractive storage medium due to its incredible density, almost negligible energy requirements, and data retention measured in centuries [1]. DNA does, however, come with new challenges. It is an organic compound with complex internal interactions that complicate the design and synthesis of DNA sequences for data storage. In this work we demonstrate a new encoding-decoding process that accounts for several of these challenges, including issues arising from the secondary structure of the sequence, repeated nucleotides, unwanted subsequences, and GC content, which is vital for ensuring stable sequences. This is accomplished by using a graph representation of the possible encoding space that captures the relevant constraints, combined with a search algorithm that identifies the optimal encoding of the given input data under these constraints. A benefit of our approach is that, by leveraging the constraints on the encoding process, the decoding algorithm is able to correct single point errors without the aid of error correction codes; this is something no current competing solution can accomplish.
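
To make the sequence constraints named in the abstract concrete, the following is a minimal sketch of how a candidate sequence might be screened for GC content, repeated nucleotides, and unwanted subsequences. It is not the thesis's implementation: the function names, the GC window of 40 to 60 percent, the maximum run length of 3, and the example forbidden motif are all assumptions chosen for illustration, and secondary structure is not checked here.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (seq must be non-empty)."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq: str) -> int:
    """Length of the longest run of a single repeated nucleotide."""
    if not seq:
        return 0
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def satisfies_constraints(
    seq: str,
    gc_range: tuple = (0.4, 0.6),   # assumed GC window, not from the thesis
    max_run: int = 3,               # assumed homopolymer limit
    forbidden: tuple = ("GAATTC",), # hypothetical unwanted subsequence
) -> bool:
    """Return True if seq passes all three illustrative checks."""
    if not seq:
        return False
    low, high = gc_range
    return (
        low <= gc_content(seq) <= high
        and longest_run(seq) <= max_run
        and not any(motif in seq for motif in forbidden)
    )
```

In a graph-based encoder of the kind the abstract describes, checks like these would plausibly determine which transitions between partial sequences are allowed, so that the search only visits encodings that satisfy the constraints.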