Freiburg RNA Tools
Teaching RNA algorithms

Teaching RNA algorithms

In order to enable an example-driven learning and teaching of RNA structure related algorithms, we provide here Javascript-based implementations for various algorithms for RNA structure and RNA-RNA interaction prediction. To reduce the level of complexity, all algorithms use a simple Nussinov-like energy scoring scheme, i.e. the energy of an RNA structure is directly related to its number of base pairs without further distinction.

RNA structure

Using an adapted umambigous version of the Nussinov algorithm it is possible to count the number of all possible nested secondary structures an RNA molecule can form.

The Nussinov algorithm enables the efficient computation of the structure with the maximal number of base pairs for a given RNA sequence.

The McCaskill algorithm enables the efficient computation of RNA structure probabilities as well as probabilities that a certain base pair is formed. Furthermore, unpaired probabilities for subsequences can be computed that reflect the accessibility of RNA parts for other interactions.

The maximum expected accuracy (MEA) algorithm uses base pair probabilities as well as unpaired probabilites (e.g. computed via the McCaskill algorithm) to find the structure that is 'maximally accurate' in its structural elements.

RNA-RNA interaction

RNA-RNA interactions can be predicted by linking the two interacting RNAs into one pseudo-RNA and use a variant of the Nussinov algorithm to predict the fused structure. Such approaches are called 'co-folding'.

Given two RNA molecules, the RNA-RNA interaction with the maximal number of intermolecular base pairs can be predicted. Such algorithms ignore the competition between intra- and intermolecular base pair formation, e.g. considered in 'co-folding' or 'accessibilty-based' approaches.

Since a nucleotide can form at most one base pair, there is a competition between intra- and intermolecular base pair formation. This can be incorporated into RNA-RNA interaction prediction by also scoring the probability that an interacting subsequence is not involved in intramolecular base pairs, i.e. it is accessible.