- Martin Mann, Mostafa M Mohamed, Syed M Ali, and Rolf Backofen

Interactive implementations of thermodynamics-based RNA structure and RNA-RNA interaction prediction approaches for example-driven teaching

PLOS Computational Biology, 14 (8), e1006341, 2018.

- Martin Raden, Syed M Ali, Omer S Alkhnbashi, Anke Busch, Fabrizio Costa, Jason A Davis, Florian Eggenhofer, Rick Gelhausen, Jens Georg, Steffen Heyne, Michael Hiller, Kousik Kundu, Robert Kleinkauf, Steffen C Lott, Mostafa M Mohamed, Alexander Mattheis, Milad Miladi, Andreas S Richter, Sebastian Will, Joachim Wolff, Patrick R Wright, and Rolf Backofen

Freiburg RNA tools: a central online resource for RNA-focused research and teaching

Nucleic Acids Research, 46(W1), W25-W29, 2018.

# Teaching - accessibility of interacting sites

Source at github@BackofenLab/RNA-Playground

State-of-the-art RNA-RNA interaction prediction algorithms take
the accessibility of the interacting regions into account. That is,
a penalty corrects the interaction score according to how
strongly the interaction sites are involved in intramolecular base pairs.
This can be expressed by converting unpaired probabilities $P^{u}_{i,j}$ into
pseudo-energy scores via $-RT\log(P^{u}_{i,j})$, which represent the amount
of energy (within the structure ensemble) needed to unfold a region $i..j$
in order to make it accessible for intermolecular RNA-RNA interactions
(Ulrike Mückstein et al., 2006).
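
As a rough numerical illustration, assume that $\log$ denotes the natural logarithm and that the normalized temperature is $RT = 1$ (both are assumptions made for this example only):

$$
P^{u}_{i,j} = 0.5 \;\Rightarrow\; -RT\log(0.5) \approx 0.69, \qquad P^{u}_{i,j} = 0.01 \;\Rightarrow\; -RT\log(0.01) \approx 4.61
$$

A region that is unpaired in half of the ensemble is thus cheap to open, whereas a region that is almost always paired incurs a penalty several times larger.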

Here, we extend the hybrid-only approach by integrating an accessibility-based scoring. To this end, the simplified McCaskill approach is used to compute unpaired probabilities $P^{u}$. The resulting penalties are used in a post-processing step to correct the hybridization energies in order to identify the interaction that optimizes the combination of hybridization and accessibility scoring.

RNA sequence $S^1$:

RNA sequence $S^2$:

(Computation uses reversed sequence $\overleftarrow{S^2}$)

Minimal loop length $l$:

Energy weight of base pair $E_{bp}$:

'Normalized' temperature $RT$:

The following recursions are used to compute the interactions that
show an optimal combination of hybridization and accessibility.
To this end, the maximal number of base pairs $D^{i,k}_{j,l}$ is first
computed for all interaction sites using the
hybrid-only approach.
Furthermore, the unpaired probabilities $P^{u1}$ and $P^{u2}$ are
tabulated for the sequences $S^1$ and $\overleftarrow{S^2}$, respectively, using the
simplified McCaskill approach.

Given these values, the accessibility-incorporating interaction scores are computed and stored in table $I$. A non-zero entry $I^{i,k}_{j,l}$ represents the combined score for an interaction of $S^1_{i..k}$ with $\overleftarrow{S^2_{j..l}}$ with leftmost/rightmost base pairs $(S^1_i,\overleftarrow{S^2_j})$/$(S^1_k,\overleftarrow{S^2_l})$, respectively.
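
The combination of both tables can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used on this page: it assumes that the hybridization contribution is $-E_{bp}\cdot D^{i,k}_{j,l}$ (with a negative $E_{bp}$), that both penalties $-RT\log(P^{u})$ are subtracted from it, and that a positive result marks a favorable interaction. The function names and the dictionary-based tables are hypothetical.

```python
import math


def combined_score(num_bp, p_u1, p_u2, e_bp=-1.0, rt=1.0):
    """Assumed combined score: hybridization gain minus both penalties."""
    hybrid_gain = -e_bp * num_bp       # e_bp is negative, so the gain is positive
    ed1 = -rt * math.log(p_u1)         # penalty to open the site in S1
    ed2 = -rt * math.log(p_u2)         # penalty to open the site in reversed S2
    return hybrid_gain - ed1 - ed2


def best_interaction(d_table, pu1, pu2, e_bp=-1.0, rt=1.0):
    """Return ((i, k, j, l), score) of the best favorable site, or None.

    d_table maps (i, k, j, l) to the maximal number of base pairs,
    pu1 maps (i, k) and pu2 maps (j, l) to unpaired probabilities.
    """
    best = None
    for (i, k, j, l), num_bp in d_table.items():
        score = combined_score(num_bp, pu1[(i, k)], pu2[(j, l)], e_bp, rt)
        # Only positive scores count as favorable under the assumed convention.
        if score > 0 and (best is None or score > best[1]):
            best = ((i, k, j, l), score)
    return best
```

In this sketch, a long site with many base pairs can still lose against a shorter one if its regions are rarely unpaired, which is the effect the accessibility correction is meant to capture.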

## Visualization of interacting base pairs (selected structure)

Due to the four-dimensionality of $I$, we only list the optimal
hybrid structures (up to 15). On selection, the intermolecular base pairs are
visualized. If all entries $I^{i,k}_{j,l}$ show negative scores,
no favorable interaction is possible since intramolecular base pairs
dominate the individual structures (the penalties for
possible interaction sites are too high). In that case, the structure list might be
empty.

Possible Structures |
---|

Energy of selection:

The box provides an ASCII representation of the interacting
base pairs of the selected structure with $S^{1}$ on top and $S^{2}$
on the bottom.
Note that sequence $S^{2}$ is reversed (running from right ($5'$) to left
($3'$)) within this representation.
Furthermore, if no interacting
base pairs are present, no visualization is shown.

## Accessibility

The simplified McCaskill approach
is used to compute the unpaired probabilities $P^{u}$ for each of
the two sequences. Please refer to the corresponding page for details.
These are used to compute the corresponding energy penalties $ED$ required to make an
interacting site $i..j$ accessible, i.e. $ED_{i,j} = -RT\cdot\log(P^{u}_{i,j})$.
In the following, the penalties for $S^1$ and $\overleftarrow{S^2}$ are
visualized in dotplot format, i.e. the larger the dot, the higher
the penalty for the respective subsequence to form an interaction.
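
The conversion of a table of unpaired probabilities into penalties and relative dot sizes could look as follows. This is an illustrative sketch only: the function names, the dictionary layout keyed by $(i, j)$, the natural logarithm, and the linear scaling of dot sizes are assumptions and are not taken from the page's source code.

```python
import math


def penalty_table(p_unpaired, rt=1.0):
    """Map each site (i, j) to ED = -RT * log(P^u); P^u = 0 gives infinity."""
    ed = {}
    for site, p in p_unpaired.items():
        ed[site] = math.inf if p <= 0.0 else -rt * math.log(p)
    return ed


def dot_sizes(ed, max_size=1.0):
    """Scale penalties linearly to [0, max_size] (assumed dotplot scaling)."""
    finite = [v for v in ed.values() if math.isfinite(v)]
    largest = max(finite, default=0.0)
    if largest <= 0.0:
        # No finite positive penalty: only inaccessible sites get a dot.
        return {s: (max_size if math.isinf(v) else 0.0) for s, v in ed.items()}
    # Infinite penalties are capped at the largest finite one.
    return {s: max_size * min(v, largest) / largest for s, v in ed.items()}
```

With these assumptions, a site that is always unpaired gets no dot, and the site with the highest finite penalty gets the largest one.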

Penalties $ED^{1}$ to make site in $S^{1}$ accessible:

Penalties $ED^{2}$ to make site in $\overleftarrow{S^2}$ accessible: