Covariance models (CMs) are a highly sensitive tool for finding non-coding RNA (ncRNA) genes in DNA sequence data, but they are extremely slow. One reason is that they allow every possible combination of insertions and deletions relative to the consensus model, even though the vast majority of these are never seen in practice. In this paper we examine reducing the number of states in covariance models. We introduce a simplified CM with fewer states that can be scored much faster. For the let7 ncRNA family, we compare the results of a full CM with those of a reduced-state model found using a genetic algorithm.
Smith, Jennifer A. (2006). "Accelerated Non-Coding RNA Searches with Covariance Model Approximations". IEEE Congress on Evolutionary Computation, 2006, 2728-2733. http://dx.doi.org/10.1109/CEC.2006.1688650