Publication Date


Date of Final Oral Examination (Defense)


Type of Culminating Activity


Degree Title

Doctor of Philosophy in Computing


Computer Science

Major Advisor

Eric J. Hayden Ph.D.


Grady B. Wright Ph.D.


Steven Cutchin Ph.D.


Biomolecules could be engineered to solve many societal challenges, including disease diagnosis and treatment, environmental sustainability, and food security. However, our limited understanding of how mutational variants alter molecular structures and functional performance has constrained the potential of important technological advances, such as high-throughput sequencing and gene editing. Ribonucleic acid (RNA) sequences are thought to play a central role in many of these challenges. Their continual discovery throughout all domains of life is evidence of their biological importance (Weinreb et al., 2016). The self-cleaving ribozyme is a class of noncoding RNA (ncRNA) that has been useful for relating sequence variants to structural features and their associated catalytic activities. Self-cleaving ribozymes possess tractable sequence spaces, perform easily identifiable catalytic functions, and have well-documented structures. Determining the structure and catalytic activity of a self-cleaving ribozyme in the laboratory is typically slow and expensive, yet most current explorations of structure and function come from these empirical processes. Computational approaches to the prediction of catalytic activity and structure are fast and inexpensive, but have failed both to achieve atomic accuracy and to correctly identify all base-pair interactions (Watkins et al., 2018). One prominent impediment to computational approaches is the lack of existing structural and functional data typically required by predictive models (Jumper et al., 2021). Using data from deep-mutational scanning experiments and high-throughput sequencing technology, it is possible to computationally map mutational variants to their observed catalytic activity for a range of self-cleaving ribozymes. The resulting map reveals important base-pairing relationships that, in turn, facilitate accurate predictions of higher-order variants.
Using sequence data from three experimental replicates of five model self-cleaving ribozymes, I will identify and map all single and double mutation variants to their observed cleavage activity. These mappings will be used to identify structural features within each ribozyme. Next, I will show within a training tool how observed cleavage for multiple reaction times can be used to identify the catalytic rates of our model ribozymes. Finally, I will predict the functional activity for model ribozyme variants of various mutational orders using machine learning models trained only on functionally labeled sequence variants. Together, these three dissertation chapters represent the kind of analysis needed to further the implementation of more accurate structural and functional prediction algorithms.
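
As a rough illustration of the first two steps described above, the sketch below enumerates all single and double substitution variants of a sequence, converts read counts into a fraction cleaved, and estimates a cleavage rate from several reaction times. It is only a minimal sketch under stated assumptions, not the method used in the dissertation: the function names, the toy four-base sequence, and the read counts are all hypothetical, and the rate fit assumes simple first-order kinetics, f(t) = 1 - exp(-k t), with complete cleavage at long times.

```python
import math
from itertools import combinations, product

BASES = "ACGU"


def enumerate_variants(wild_type, max_order=2):
    """Yield every variant of wild_type carrying 1..max_order substitutions."""
    for order in range(1, max_order + 1):
        for positions in combinations(range(len(wild_type)), order):
            # At each chosen position, the three bases that differ from wild type.
            choices = [[b for b in BASES if b != wild_type[p]] for p in positions]
            for subs in product(*choices):
                seq = list(wild_type)
                for p, b in zip(positions, subs):
                    seq[p] = b
                yield "".join(seq)


def fraction_cleaved(cleaved_reads, uncleaved_reads):
    """Fraction of reads for one variant that were observed in cleaved form."""
    total = cleaved_reads + uncleaved_reads
    return cleaved_reads / total if total else float("nan")


def fit_rate_constant(times, fractions):
    """Estimate k in the ASSUMED model f(t) = 1 - exp(-k t).

    Linearizes to -ln(1 - f) = k t and fits a line through the origin
    by least squares. Points with f outside [0, 1) are skipped.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times, fractions):
        if 0.0 <= f < 1.0:
            num += t * -math.log(1.0 - f)
            den += t * t
    return num / den if den else float("nan")


if __name__ == "__main__":
    toy_wild_type = "ACGU"  # hypothetical sequence, not a real ribozyme
    variants = list(enumerate_variants(toy_wild_type))
    print(len(variants), "single and double variants")

    # Hypothetical read counts (cleaved, uncleaved) at three reaction times.
    times = [1.0, 2.0, 4.0]
    counts = [(39, 61), (63, 37), (86, 14)]
    fractions = [fraction_cleaved(c, u) for c, u in counts]
    print("estimated k:", fit_rate_constant(times, fractions))
```

For a sequence of length L there are 3L single and 9·L(L-1)/2 double substitution variants, which is why the sequence space of a short ribozyme remains tractable for exhaustive mapping. Real data would also call for handling replicates and read-depth noise, which this sketch omits.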

