**Introduction**

MutaRNA (Mutational Analysis of RNAs) predicts and visualizes the
structural aberration effects of mutations/variations in RNA sequences.
The MutaRNA web server provides an integrative, thermodynamics-based analysis of the structural impact of RNA mutations. Given an RNA in FASTA format along with a mutation, comparative and differential visualizations via dot plots and circular plots show the effect of the mutation on the intramolecular base pairing patterns of the RNA. The base pair probabilities are computed with RNAplfold. Differential probabilities highlight the changes caused by mutations, which is especially useful for long transcripts with complex structures.

# When using MutaRNA please cite:

- Yogita Sharma, Milad Miladi, Sandeep Dukare, Karine Boulay, Maiwen Caudron-Herger, Matthias Groß, Rolf Backofen, and Sven Diederichs. A pan-cancer analysis of synonymous mutations. Nature Communications, 10, 2569, 2019.
- Martin Raden, Syed M Ali, Omer S Alkhnbashi, Anke Busch, Fabrizio Costa, Jason A Davis, Florian Eggenhofer, Rick Gelhausen, Jens Georg, Steffen Heyne, Michael Hiller, Kousik Kundu, Robert Kleinkauf, Steffen C Lott, Mostafa M Mohamed, Alexander Mattheis, Milad Miladi, Andreas S Richter, Sebastian Will, Joachim Wolff, Patrick R Wright, and Rolf Backofen. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Research, 46(W1), W25-W29, 2018.

Results are computed with MutaRNA version 1.3.0 (using RNAplfold 2.4.14, remuRNA 1.0, RNAsnp 1.2)

**Overview**

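The following parameters control the execution of MutaRNA: the sequence parameters and the folding options described below.

Furthermore, additional information is available on the output, the input examples, and the list of changes.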

# Sequence Parameters

## Wild type single RNA sequence in FASTA

Wild type RNA sequence in FASTA format.

The parameter constraints are: The input has to be in valid FASTA format. Exactly one sequence has to be provided. The sequence length has to be in the range 7-2000. The allowed sequence alphabet is 'ACGUTacgut'.
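The constraints above can be checked locally before submitting a job. The following is a minimal sketch, not the server's own validation code; the function name `parse_single_fasta` and its error messages are hypothetical.

```python
# Allowed alphabet as stated in the parameter constraints.
ALLOWED = set("ACGUTacgut")


def parse_single_fasta(text, min_len=7, max_len=2000):
    """Return (header, sequence) or raise ValueError if a constraint is violated."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input must start with a FASTA header line ('>')")
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) != 1:
        raise ValueError("exactly one sequence is required")
    # Sequence lines may be wrapped, so join everything after the header.
    sequence = "".join(lines[1:])
    if not min_len <= len(sequence) <= max_len:
        raise ValueError(f"sequence length {len(sequence)} outside {min_len}-{max_len}")
    bad = set(sequence) - ALLOWED
    if bad:
        raise ValueError(f"characters not allowed: {sorted(bad)}")
    return lines[0][1:], sequence
```

For example, `parse_single_fasta(">wt\nGGGAAACCC\n")` yields the header `wt` and the sequence `GGGAAACCC`, while a sequence shorter than 7 nucleotides is rejected.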

## Mutation encoding

The mutant sequence is determined according to the
provided mutation encoding.

For single-nucleotide polymorphisms (SNPs), a SNP tag defines the single nucleotide mutation/variation to be analyzed and compared with the wild type. The format is a string of wildtype-base concatenated with the position-index followed by the mutant-base. For example 'C3G' defines a change from C at position 3 in WT to G. The position index is 1-based, i.e., the first base in the sequence has index 1.

For multi-nucleotide mutations, the respective SNP tags can be joined via '-' dash symbols, e.g. 'C3G-A4G' encodes the mutation of the two adjacent positions 3 and 4.


The parameter constraints are: SNP tag, exemplary form G33C, or dash-separated multi-nt mutation like C3G-A4G. Mutation has to change the nucleotide and has to correspond with wild type sequence.
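To illustrate the encoding, the sketch below parses a mutation string and derives the mutant sequence from the wild type. It is an assumption-laden illustration rather than MutaRNA's implementation: the names `SNP_TAG` and `apply_mutations` are hypothetical, and bases are compared literally, so whether the server treats `T` and `U` as equivalent is not covered.

```python
import re

# One SNP tag: wild type base, 1-based position, mutant base (e.g. C3G).
SNP_TAG = re.compile(r"^([ACGUT])(\d+)([ACGUT])$")


def apply_mutations(wild_type, encoding):
    """Return the mutant sequence for a tag like 'C3G' or 'C3G-A4G'."""
    reference = wild_type.upper()
    mutant = list(reference)
    for tag in encoding.upper().split("-"):
        match = SNP_TAG.match(tag)
        if match is None:
            raise ValueError(f"malformed SNP tag: {tag!r}")
        wt_base, position, mut_base = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(reference):
            raise ValueError(f"position {position} outside the sequence")
        if wt_base == mut_base:
            raise ValueError(f"tag {tag!r} does not change the nucleotide")
        # Position index is 1-based, Python strings are 0-based.
        if reference[position - 1] != wt_base:
            raise ValueError(f"tag {tag!r} does not match the wild type sequence")
        mutant[position - 1] = mut_base
    return "".join(mutant)
```

With the wild type `ACCAU`, the encoding `C3G-A4G` gives `ACGGU`; a tag such as `G3C` is rejected because position 3 of the wild type is not `G`.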

# Folding options

## Window size (RNAplfold)

Length of the windows used to compute base pair probabilities with RNAplfold. For sequences shorter than about 200 nucleotides, the full sequence length may be used as window size.

The parameter constraints are: Input value has to be parsable as Integer. The value must be greater than or equal to 10 and must be smaller than or equal to 2000. The window size must not be smaller than the maximal base pair span.

## Maximal base pair span (RNAplfold)

Maximal span of base pairs considered by RNAplfold for the probability computation. For non-coding RNAs and sequences shorter than about 200 nucleotides, the full sequence length is recommended as base pair span. For reliable predictions, values larger than 500-800 are not recommended. For identifying mutation effects on cis-regulatory elements of mRNAs, the default span of 150 combined with a window size of 200 is recommended. A span longer than the window size would have no effect.

The parameter constraints are: Input value has to be parsable as Integer. The value must be greater than or equal to 10 and must be smaller than or equal to 2000.
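The interplay of the two folding options can be summarized in a short sketch. The helper names `plfold_settings` and `plfold_args` are hypothetical, and clamping both values to the sequence length for short inputs is an assumption based on the recommendations above, not documented server behavior. The options `-W` (window size) and `-L` (maximal base pair span) are those of the RNAplfold command line tool; how MutaRNA itself invokes RNAplfold is not described here.

```python
def plfold_settings(seq_len, window=200, span=150):
    """Validate window size and span, then adapt them to a short sequence."""
    for name, value in (("window size", window), ("base pair span", span)):
        if not 10 <= value <= 2000:
            raise ValueError(f"{name} must be in the range 10-2000, got {value}")
    if span > window:
        raise ValueError("window size must not be smaller than the base pair span")
    # Assumption: for short sequences use the full length as window and span.
    window = min(window, seq_len)
    span = min(span, window)
    return window, span


def plfold_args(window, span):
    """Build an RNAplfold argument list (the sequence is read from stdin)."""
    return ["RNAplfold", "-W", str(window), "-L", str(span)]
```

For a 1000 nt mRNA the recommended values 200 and 150 are kept, whereas an 80 nt sequence ends up with a window and span of 80.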

# Output Description

MutaRNA plots the base pairing potential of the wild type and
mutant RNA sequences side by side, in the form of base pair
probabilities predicted with the thermodynamic model of RNAplfold.


## Base pairing of wild type vs. mutant

**Base pair probs matrix:** Base pair probabilities of the wild type and mutant sequences are shown in the form of a matrix, with the wild type on the top right and the mutant on the bottom left. The probability of a base pair is the chance of that base pair occurring in the ensemble of structures.

**Wild type and Mutant circular plots:** Base pair probabilities of the wild type and mutant sequences are shown as edges of circular Circos plots. The intensity of an edge corresponds to the probability of base pairing between the two connected bases, such that a darker edge is more probable.

## Differential comparison

**Base pair prob change:** The absolute difference between the probabilities of the same base pairs in the wild type versus the mutant. The top right shows the subset of base pairs whose probabilities are decreased/weakened due to the mutation. The bottom left shows the subset of base pairs whose probabilities are increased/strengthened due to the mutation.

**Weakened and Increased:** Circular Circos plots of the subsets of base pairs whose probabilities are decreased/weakened and increased/strengthened due to the mutation.

**Base pair prob difference:** The difference between the wild type and the mutant probability of each possible base pair. The values are between -1 and 1, where a positive value means a higher probability for the wild type (i.e. `weakened`) while a negative value means a higher probability for the mutant (i.e. `increased`).

**Accessibility profiles:** The predicted nucleotide accessibility (also known as unpaired probability) is depicted for the wild type and mutant along the transcript. The difference between the predicted wild type and mutant accessibility is depicted in blue, where a positive value means a higher accessibility for the wild type and vice versa.
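The sign convention of the differential plots can be made concrete with a small sketch. It assumes that base pair probabilities are available as dictionaries mapping a position pair `(i, j)` to a probability and that accessibilities are plain lists; both data layouts and the function names are hypothetical and not part of MutaRNA's output format.

```python
def split_difference(p_wt, p_mut):
    """Split wild type minus mutant probabilities into weakened and increased pairs."""
    weakened, increased = {}, {}
    for pair in set(p_wt) | set(p_mut):
        diff = p_wt.get(pair, 0.0) - p_mut.get(pair, 0.0)
        if diff > 0:
            weakened[pair] = diff      # more probable in the wild type
        elif diff < 0:
            increased[pair] = -diff    # more probable in the mutant
    return weakened, increased


def accessibility_difference(acc_wt, acc_mut):
    """Per-position difference; positive means more accessible in the wild type."""
    return [wt - mut for wt, mut in zip(acc_wt, acc_mut)]
```

Storing the absolute values in the two subsets mirrors the "Base pair prob change" view, while the signed difference corresponds to the "Base pair prob difference" view.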

## Scoring of mutation

**Assessment using remuRNA:** remuRNA computes the relative entropy `H(wt||mu)` between the structure ensembles of the wild type and the mutant. A higher value means a larger change in the distribution of structures and thus a higher impact of the mutation. The minimum free energies of the wild type `MFE(wt)` and the mutant `MFE(mu)` and their difference `MFE delta` are also computed. The relative entropy is non-negative and dimensionless. The energy entries are given in `kcal/mol`.

If you use the remuRNA score, you may want to cite doi:10.1007/978-3-642-29627-7_25. remuRNA results are reported for the default parameter setup (excluding present MutaRNA parameters).

**Assessment using RNAsnp:** Structural change in terms of the maximum base pair probability Euclidean `distance`, reported for the subsequence `interval` accommodating the largest change. RNAsnp computes the statistical significance of the structure aberration due to the mutation in terms of an empirical `p-value`, using a background model of mutations with similar GC context and length.

In `mode 1` the base pair probabilities are calculated in a global folding mode with RNAfold. In `mode 2` the base pair probabilities are calculated in a local folding mode with RNAplfold. Default parameters are used in both modes.

If you use RNAsnp scores, you may want to cite doi:10.1002/humu.22273. RNAsnp results are reported for the default parameter setup (excluding present MutaRNA parameters).
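The two kinds of scores can be illustrated on toy data. The sketch below is not how remuRNA or RNAsnp are implemented: it assumes an explicitly enumerated structure distribution for the relative entropy, uses the natural logarithm (the base used by remuRNA is not stated here), and scans intervals of one fixed length for the distance, whereas the interval selection and the p-value computation of RNAsnp are not reproduced. All function names are hypothetical.

```python
import math


def relative_entropy(p_wt, p_mu):
    """Sum over structures s of P_wt(s) * ln(P_wt(s) / P_mu(s))."""
    total = 0.0
    for structure, p in p_wt.items():
        if p == 0.0:
            continue  # terms with zero wild type probability contribute nothing
        q = p_mu.get(structure, 0.0)
        if q == 0.0:
            return math.inf  # structure impossible in the mutant ensemble
        total += p * math.log(p / q)
    return total


def max_interval_distance(pair_wt, pair_mu, interval_len):
    """Largest Euclidean distance between pairing probability profiles.

    Returns (distance, (start, end)) with 1-based, inclusive positions.
    """
    if len(pair_wt) != len(pair_mu) or not 1 <= interval_len <= len(pair_wt):
        raise ValueError("profiles must have equal length >= interval_len")
    best_distance, best_interval = -1.0, None
    for start in range(len(pair_wt) - interval_len + 1):
        squared = sum((pair_wt[k] - pair_mu[k]) ** 2
                      for k in range(start, start + interval_len))
        distance = math.sqrt(squared)
        if distance > best_distance:
            best_distance, best_interval = distance, (start + 1, start + interval_len)
    return best_distance, best_interval
```

Identical distributions give a relative entropy of zero, consistent with the score being non-negative, and the reported interval is the one where the two profiles differ most.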

# Input Examples

## IRE mutation in FTL

Mutation G19U in the 5' UTR of the ferritin light chain gene, which contains the iron response element (IRE), a long hairpin structure. The mutation disrupts the IRE, which causes the loss of iron-level-based regulation of gene expression.


## A30C in KRAS gene

Revisiting mutation A30C in the KRAS gene, which is known to increase gene expression (Sharma et al., 2019). Note that the input sequence here consists of the mutation site with ±100 nt of sequence context, i.e. the mutation is at position 101. The loss of the first hairpin close to the RBS might explain the increased translation efficiency.

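The position shift caused by cutting out sequence context can be sketched as follows. The helper `with_context` is hypothetical and only illustrates the index arithmetic; it is not part of MutaRNA, and the test sequence in the usage note is synthetic rather than the KRAS transcript.

```python
def with_context(transcript, position, flank=100):
    """Cut out position +- flank and return (subsequence, new 1-based position)."""
    if not 1 <= position <= len(transcript):
        raise ValueError("position outside the transcript")
    # Clip the window at the transcript ends (1-based, inclusive bounds).
    start = max(1, position - flank)
    end = min(len(transcript), position + flank)
    return transcript[start - 1:end], position - start + 1
```

If at least 100 nt are available on both sides, the result is 201 nt long and the mutated base sits at position 101, so the mutation encoding has to use the new index rather than the original one.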

# List of Changes

- 4.8.2 - 200410 : MutaRNA v1.3 with extended output
- 4.8.1 - 200311 : MutaRNA v1.2 goes online
- 4.7.0 - 200128 : MutaRNA v1.1 goes online