Non-Coding RNA Covariance Model Combination Using Mixed Primary-Secondary Structure Alignment
Covariance models (CMs) are very effective for finding new members of non-coding RNA (ncRNA) sequence families in genomic data. However, the computational burden of applying CM-based search algorithms can be prohibitive. When annotating the genome of a newly sequenced organism, it is usually desirable to search the sequence data with a large number of ncRNA families. The computational burden can be reduced if the families are clustered into groups of statistically similar models and a single cluster-average representative model is produced for each cluster. The database is then searched with the representative model of each cluster at a relatively low detection threshold. The output of this pre-filtering step is then processed with the models of the individual family members of the cluster. A base-pair conflict metric has previously been proposed for use in model clustering. In this work, an alternative metric is proposed that uses standard alignment algorithms together with a special mixed primary-secondary structure scoring matrix.
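The two-stage search described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the threshold values, and the motif-counting scorers are hypothetical stand-ins for real covariance model scoring, chosen only to show the control flow of pre-filtering with a representative model and then rescoring the surviving windows with each family model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a scorer maps a sequence window to a score.
# A real covariance model score would come from a CM search tool.
Scorer = Callable[[str], float]


def motif_count_scorer(motif: str) -> Scorer:
    """Toy stand-in for a family model: counts occurrences of a motif."""
    return lambda window: float(window.count(motif))


def two_stage_search(
    windows: Sequence[str],
    representative: Scorer,
    members: Dict[str, Scorer],
    prefilter_threshold: float,
    family_threshold: float,
) -> List[Tuple[int, str, float]]:
    """Pre-filter windows with the cluster representative, then score
    the surviving windows with every family model in the cluster."""
    hits: List[Tuple[int, str, float]] = []
    for index, window in enumerate(windows):
        # Stage 1: one cheap pass with the representative model at a
        # relatively low threshold, so few true members are discarded.
        if representative(window) < prefilter_threshold:
            continue
        # Stage 2: only windows that pass the pre-filter are scored
        # with the individual family models.
        for family, scorer in members.items():
            score = scorer(window)
            if score >= family_threshold:
                hits.append((index, family, score))
    return hits
```

In this sketch each window is scored once by the representative, and the per-family scoring is paid only for windows that pass the pre-filter. Any saving therefore depends on how many windows the low threshold rejects.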
Smith, Jennifer A. (2013). "Non-Coding RNA Covariance Model Combination Using Mixed Primary-Secondary Structure Alignment". Computational Intelligence Methods for Bioinformatics and Biostatistics, 7845, 95-104. http://dx.doi.org/10.1007/978-3-642-38342-7_9