Covariance models are an extremely powerful tool for non-coding RNA (ncRNA) gene finding, but they are also extremely computationally demanding. A major reason for the high computational burden is that the search proceeds through every possible start position in the database and, at each start position, every possible sequence length between zero and a user-defined maximum. Furthermore, for every start position and sequence length, all combinations of insertions and deletions that yield the given sequence length are searched. It has previously been shown that a large portion of this search space lies far from any database match observed in practice, and that the search space can be limited significantly with little change in expected search results. In this work, a different approach is taken: the space of starting positions, sequence lengths, and insertion/deletion patterns is searched using a genetic algorithm.
This document was originally published by IEEE in Computational Intelligence and Bioinformatics and Computational Biology, 2006. Copyright restrictions may apply. DOI: 10.1109/CIBCB.2006.330953
Smith, Jennifer A. (2006). "Covariance Searches for ncRNA Gene Finding". CIBCB '06: 2006 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology. http://dx.doi.org/10.1109/CIBCB.2006.330953