Publication Date

8-2018

Date of Final Oral Examination (Defense)

7-11-2018

Type of Culminating Activity

Dissertation

Degree Title

Doctor of Philosophy in Biomolecular Sciences

Department

Biology

Major Advisor

Eric J. Hayden, Ph.D.

Advisor

Matthew L. Ferguson, Ph.D.

Advisor

Elton Graugnard, Ph.D.

Abstract

Fitness landscapes, or adaptive landscapes, represent the mapping of genotype (sequence) to phenotype (function or fitness). Originally proposed as a metaphor for envisioning evolutionary processes and mutational interactions, the fitness landscape has recently transitioned from a theoretical construct to an empirical one, due in part to advances in DNA synthesis and high-throughput sequencing. These advances allow for the construction and analysis of empirical fitness landscapes that encompass thousands of genotypes. Such landscapes provide tractable insight into mutational pathways, the predictability of evolution or even the evolution of life. RNA enzymes (ribozymes) are an attractive model system for the construction of empirical fitness landscapes because they function as both a genotype (primary RNA sequence) and a phenotype (catalytic function). To construct and characterize empirical RNA fitness landscapes, two high-throughput functional assays (self-cleavage and self-ligation) were developed and implemented, along with a technique that uses phased nucleotide inserts to improve data recovery from high-throughput sequencing (Appendix A). Following fitness landscape construction, a stochastic evolutionary model based on the Wright-Fisher model was developed and employed. This model follows the principles of Darwinian evolution and allows a population to explore the fitness landscape by means of mutation and selection. These newly developed tools allowed for a novel approach to important evolutionary questions.
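
As an illustration of how a Wright-Fisher style population can explore a landscape by mutation and selection, the following is a minimal sketch and not the dissertation's actual model. The toy landscape (fitness rising with the number of matches to a target sequence), the per-site mutation rate `mu`, and all function names are assumptions introduced here for illustration only.

```python
import random
from itertools import product

ALPHABET = "ACGU"

def toy_landscape(target):
    """Hypothetical landscape: fitness grows with matches to `target`.

    Every genotype of the same length gets a positive fitness, so
    selection weights never sum to zero.
    """
    landscape = {}
    for bases in product(ALPHABET, repeat=len(target)):
        genotype = "".join(bases)
        matches = sum(a == b for a, b in zip(genotype, target))
        landscape[genotype] = 0.1 + matches / len(target)
    return landscape

def mutate(genotype, mu, rng):
    """Replace each position by a different nucleotide with probability mu."""
    out = []
    for base in genotype:
        if rng.random() < mu:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def wright_fisher_step(population, fitness, mu, rng):
    """One generation: fitness-proportional sampling of parents, then mutation.

    Population size stays constant, as in the Wright-Fisher model.
    """
    weights = [fitness[g] for g in population]
    parents = rng.choices(population, weights=weights, k=len(population))
    return [mutate(p, mu, rng) for p in parents]

def simulate(start, fitness, pop_size, generations, mu, seed=0):
    """Evolve a clonal starting population; return it with mean fitness per generation."""
    rng = random.Random(seed)
    population = [start] * pop_size
    mean_fitness = []
    for _ in range(generations):
        population = wright_fisher_step(population, fitness, mu, rng)
        mean_fitness.append(sum(fitness[g] for g in population) / pop_size)
    return population, mean_fitness

if __name__ == "__main__":
    landscape = toy_landscape("GCU")
    final_pop, trace = simulate("AAA", landscape, pop_size=200,
                                generations=50, mu=0.01)
    print(trace[0], trace[-1])
```

In a data-driven version, the dictionary returned by `toy_landscape` would presumably be replaced by measured ribozyme activities for each genotype; how the dissertation maps activity to fitness is not specified in this abstract.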

Chapter 1 explored the evolution of innovation at the intersection of two ribozyme functions: self-cleavage and self-ligation. Evolutionary innovations are qualitatively novel traits that emerge through evolution. Theories have suggested that innovations can occur where two genotype networks are in close proximity. However, only isolated examples of intersections have been investigated. The fitness landscape between the two ribozyme functions was explored by determining the ability of numerous neighboring RNA sequences to catalyze two different chemical reactions. This analysis revealed extensive functional overlap: over half the genotypes could catalyze both functions to some extent. Data-driven evolutionary simulations found that these numerous points of intersection facilitated the discovery of a new function, yet the rate of optimization depended upon the starting location in the genotype network. This study constructed a fitness landscape where genotype networks intersect and uncovered the implications for evolutionary innovations.

Chapter 2 determined the effect of higher sequence space complexity and dimensionality on evolutionary adaptation in RNA fitness landscapes. The complexity and dimensionality of landscapes scale with the length of the RNA molecule. For this study, complexity was defined as the size of the genotype space and dimensionality as the number of edges connecting each genotype (node) to other genotypes that differ by a single mutation. Low-dimensional ‘direct’ landscapes consisting of only two possible nucleotides at various positions were compared to higher-dimensional ‘indirect’ landscapes that had all four nucleotides at the same positions. Indirect pathways contributed to the ruggedness and navigability of landscapes. Increased dimensionality in RNA fitness landscapes had the potential to circumvent fitness valleys; however, indirect pathways also harbored stasis genotypes isolated by reciprocal sign epistasis.
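
The definitions of complexity and dimensionality above can be made concrete with a small sketch. The three variable positions and the nucleotides allowed at each are hypothetical, chosen only to show how a 'direct' landscape (two nucleotides per position) compares with an 'indirect' one (four per position). The check for reciprocal sign epistasis uses the standard two-locus definition, which is assumed here rather than taken from the dissertation.

```python
from itertools import product

def genotype_space(allowed):
    """All genotypes, given the nucleotides allowed at each variable position."""
    return ["".join(g) for g in product(*allowed)]

def neighbors(genotype, allowed):
    """Genotypes that differ from `genotype` by a single mutation."""
    result = []
    for i, options in enumerate(allowed):
        for base in options:
            if base != genotype[i]:
                result.append(genotype[:i] + base + genotype[i + 1:])
    return result

def is_reciprocal_sign_epistasis(f_ab, f_Ab, f_aB, f_AB):
    """True if each mutation's effect changes sign depending on the other.

    f_ab is the starting genotype, f_Ab and f_aB the single mutants,
    and f_AB the double mutant.
    """
    a_flips = (f_Ab - f_ab) * (f_AB - f_aB) < 0
    b_flips = (f_aB - f_ab) * (f_AB - f_Ab) < 0
    return a_flips and b_flips

# Hypothetical example with three variable positions.
direct = ["AG", "CU", "AC"]   # two nucleotides per position
indirect = ["ACGU"] * 3       # all four nucleotides per position

print(len(genotype_space(direct)), len(neighbors("ACA", direct)))      # 8 3
print(len(genotype_space(indirect)), len(neighbors("ACA", indirect)))  # 64 9
```

With L variable positions and k nucleotides per position, the space holds k^L genotypes and each genotype has L(k - 1) single-mutation neighbors, so moving from two to four nucleotides triples the number of edges per node in this example.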

Chapter 3 applied ancestral sequence resurrection and fitness landscape construction to naturally evolved ribozymes. The CPEB3 ribozyme is highly conserved in mammals and has been linked to episodic memory. By predicting, ‘resurrecting’ and functionally characterizing ancient gene sequences, hypotheses about gene function or selection can be empirically tested in an evolutionary context. Using the extant ribozyme sequences found in a range of mammalian species as a basis for inference of ancestral sequences, a phylogenetic fitness landscape was experimentally resurrected and reconstructed. A single high-activity ancestral sequence was found to be highly conserved, and purifying selection is expected to have reduced the accumulation of mutations through geologic time. Many of the extant mammalian ribozyme sequences had high ribozyme activity; however, a few had relatively low activity. Yet, given the local fitness landscape, a selective pressure for functional ribozyme sequences was seen. A single nucleotide polymorphism (SNP) found in humans reduced co-transcriptional ribozyme activity in vitro and might alter our understanding of the CPEB3 ribozyme’s biological function.

Chapter 4 analyzed epistatic interactions in four published RNA fitness landscapes generated from high-throughput analyses. Two of the landscapes were assessed in vivo and two were assessed in vitro. Epistasis occurs when the effects of some mutations are dependent on the presence or absence of other mutations. The data allowed for an analysis of the distribution of fitness effects of individual mutations as well as combinations of two or more mutations. Two different approaches to measuring epistasis in the data both revealed a predominance of negative epistasis, such that combinations of two or more mutations are typically lower in fitness than expected from the effects of the individual mutations. This finding differed from studies using computationally predicted RNA but is similar to mutational experiments in protein enzymes.
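
The abstract does not say which two approaches to measuring epistasis were used. As one commonly used possibility, the sketch below assumes a multiplicative null model in which fitness is expressed relative to the wild type (wild type = 1) and the expected fitness of a multi-mutant is the product of its single-mutant fitnesses. The function names and numeric values are hypothetical.

```python
def expected_fitness(single_mutant_fitnesses):
    """Expected relative fitness of a combined mutant under a multiplicative null model."""
    expected = 1.0
    for w in single_mutant_fitnesses:
        expected *= w
    return expected

def epistasis(observed, single_mutant_fitnesses):
    """Observed minus expected fitness; negative values indicate negative epistasis."""
    return observed - expected_fitness(single_mutant_fitnesses)

def fraction_negative(records, tolerance=0.0):
    """Fraction of (observed, singles) records whose epistasis falls below -tolerance."""
    negative = sum(1 for obs, singles in records
                   if epistasis(obs, singles) < -tolerance)
    return negative / len(records)

# Hypothetical values: two single mutants at 0.8 and 0.5 of wild-type fitness.
# The multiplicative expectation for the double mutant is 0.4, so an
# observed value of 0.2 corresponds to negative epistasis.
records = [
    (0.2, [0.8, 0.5]),   # below expectation: negative epistasis
    (0.9, [0.9, 0.9]),   # above expectation (0.81): positive epistasis
    (0.1, [0.6, 0.5]),   # below expectation (0.30): negative epistasis
]
print(fraction_negative(records))
```

A nonzero `tolerance` can stand in for measurement noise, so that small deviations from the expectation are not counted as epistasis; the appropriate threshold would depend on the assay's replicate variability.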

The work presented here represents a significant contribution to our ability to construct and empirically characterize RNA fitness landscapes. The development of two high-throughput ribozyme assays opens the door for further empirical landscape construction. The implementation of data-driven stochastic evolutionary modeling allows for a clearer evolutionary characterization of the landscape. Understanding the connection between genotype and phenotype in RNA systems is important for designing RNA functions, improving in vitro selections and understanding the origins and evolution of new RNA functions (innovations). Applying these advances yielded valuable information about evolutionary innovations, the effects of higher dimensionality, evolution of extant ribozymes and the prevalence of epistasis in RNA fitness landscapes. Construction and analysis of empirical RNA fitness landscapes provides tractable insight into evolutionary processes, mutational pathways and the predictability of evolution.

DOI

10.18122/td/1429/boisestate
